CN113382279B - Live broadcast recommendation method, device, equipment, storage medium and computer program product

Info

Publication number: CN113382279B
Application number: CN202110662041.7A
Authority: CN
Inventors: 陈敏
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-06-15
Filing date: 2021-06-15
Publication date: 2022-11-04
Anticipated expiration: 2041-06-15
Also published as: CN113382279A

Abstract

The disclosure provides a live broadcast recommendation method, a live broadcast recommendation device, live broadcast recommendation equipment, storage media and computer program products, and relates to the technical field of internet, in particular to the field of intelligent recommendation. The specific implementation scheme is as follows: intercepting a live video to obtain a current video clip; performing content identification on the current video clip to obtain a current tag set; updating a historical tag set of the live video based on the current tag set to obtain a live content tag set; and recommending the live video according to the live content tag set. Related labels can be updated in time according to live content, and the recommendation accuracy is improved.

Description

Live broadcast recommendation method, device, equipment, storage medium and computer program product

Technical Field

The present disclosure relates to the field of internet technologies, and in particular, to a live broadcast recommendation method, apparatus, device, storage medium, and computer program product.

Background

With the rapid development of internet technology, watching various live videos becomes an important way for people to obtain information. However, the number of live broadcast videos is huge, and the related contents are various and complicated, so that a user cannot timely and accurately acquire the required live broadcast contents.

Disclosure of Invention

The present disclosure provides a live broadcast recommendation method, apparatus, device, storage medium, and computer program product, which improve accuracy of live broadcast recommendation.

According to an aspect of the present disclosure, there is provided a live broadcast recommendation method, including: intercepting a live video to obtain a current video clip; performing content identification on the current video clip to obtain a current tag set; updating a historical tag set of the live video based on the current tag set to obtain a live content tag set; and recommending the live video according to the live content tag set.

According to another aspect of the present disclosure, there is provided a live broadcast recommendation apparatus, including: an intercepting module configured to intercept a live video to obtain a current video clip; an identification module configured to perform content identification on the current video clip to obtain a current tag set; an updating module configured to update a historical tag set of the live video based on the current tag set to obtain a live content tag set; and a recommending module configured to recommend the live video according to the live content tag set.

According to still another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to execute the live broadcast recommendation method described above.

According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the live broadcast recommendation method described above.

According to yet another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the live recommendation method described above.

It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is an exemplary system architecture diagram in which the present disclosure may be applied;

FIG. 2 is a flow diagram of one embodiment of a live recommendation method according to the present disclosure;

FIG. 3 is a flow diagram of another embodiment of a live recommendation method according to the present disclosure;

FIG. 4 is a flow diagram of yet another embodiment of a live recommendation method according to the present disclosure;

FIG. 5 is a flow diagram of yet another embodiment of a live recommendation method according to the present disclosure;

FIG. 6 is a flow diagram of yet another embodiment of a live recommendation method according to the present disclosure;

FIG. 7 is a schematic structural diagram of an embodiment of a live recommendation device according to the present disclosure;

FIG. 8 is a block diagram of an electronic device for implementing a live recommendation method according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of embodiments of the present disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the live recommendation method or live recommendation apparatus of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables, to name a few.

The user may interact with the server 105 over the network 104 using the terminal devices 101, 102, 103 to obtain recommended live videos, etc. Various client applications, such as a live video application, may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices described above, and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. This is not particularly limited herein.

The server 105 may provide various live video-based services. For example, the server 105 may analyze and process live videos acquired from the terminal devices 101, 102, 103, and generate a processing result (e.g., a recommendation of live videos).

The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. This is not particularly limited herein.

It should be noted that the live recommendation method provided in the embodiment of the present disclosure is generally executed by the server 105, and accordingly, the live recommendation apparatus is generally disposed in the server 105.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2, a flow 200 of one embodiment of a live recommendation method according to the present disclosure is shown. The live broadcast recommendation method comprises the following steps:

Step 201, intercepting the live video to obtain the current video clip.

In this embodiment, an execution subject (for example, the server 105 shown in fig. 1) of the live broadcast recommendation method may detect a live broadcast state of a live broadcast room, and if it is detected that a live broadcast starts to be performed in the live broadcast room, may obtain a video stream of a current live broadcast, and intercept the video stream. In general, a video stream of a fixed duration can be intercepted as a current video clip of a live video according to actual needs or computing power of a model.

Step 202, identifying the content of the current video clip to obtain a current tag set.

In this embodiment, after the executing entity obtains the current video segment, the executing entity may further identify the live content related to the current video segment, for example, the current video segment may be input into a content identification model trained in advance, and a corresponding identification result is output by the model. In general, the recognition result may include a plurality of keywords associated with the current live content, each of which may be referred to as a video tag. All video tags obtained by content identification may constitute the current tag set.

Step 203, updating the historical tag set of the live video based on the current tag set to obtain a live content tag set.

In this embodiment, after obtaining the current tag set, the executing entity may further update the historical tag set of the live video by using the current tag set. The historical tag set may be a tag set corresponding to all live video before the current video clip in the live broadcast process. The historical tag set may be identified by the category identification model, or may be determined based on information such as the type of the live broadcast room or the past live broadcast history of the anchor. When the historical tag set is updated, the tags in the historical tag set can be corrected by taking the tags in the current tag set as a reference, so that the live content tag set is obtained.

For example, suppose an anchor usually live-streams mathematics courses for elementary school students, so the historical tag set may include the tag "elementary school math". After content recognition is performed on the current video clip, the resulting current tag set may include the tag "addition and subtraction within twenty". Big data analysis can reveal which primary school grade's mathematics curriculum covers addition and subtraction within twenty, so the historical tag "elementary school math" can be updated based on the current tag "addition and subtraction within twenty" to obtain a more specific tag naming the corresponding primary school grade's mathematics, which then serves as one tag in the live content tag set.

Step 204, recommending the live video according to the live content tag set.

In this embodiment, after obtaining the live content tag set, the execution main body may recommend the live video to the user interested in the live content tag set according to the specific text content of each tag in the live content tag set. For example, a tag of a live video that a user has viewed in the past may be acquired as an interest tag of the user, and then the live video recommended to the user may be determined according to a matching degree of a live content tag set and the user interest tag.
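The matching step described above can be illustrated with a minimal sketch. It assumes that the matching degree is measured as the Jaccard overlap between the live content tag set and the user's interest tags; the disclosure does not name a specific measure, and the function names `matching_degree` and `rank_live_videos` are hypothetical.

```python
from typing import Dict, List, Set


def matching_degree(live_tags: Set[str], interest_tags: Set[str]) -> float:
    """Jaccard overlap between two tag sets (an assumed matching measure)."""
    union = live_tags | interest_tags
    if not union:
        return 0.0
    return len(live_tags & interest_tags) / len(union)


def rank_live_videos(
    live_tag_sets: Dict[str, Set[str]], interest_tags: Set[str], top_k: int = 3
) -> List[str]:
    """Return the ids of the top_k live videos, best match first."""
    ranked = sorted(
        live_tag_sets.items(),
        key=lambda item: matching_degree(item[1], interest_tags),
        reverse=True,
    )
    return [video_id for video_id, _ in ranked[:top_k]]
```

Here the interest tags would be collected from the tags of live videos the user viewed in the past, as the paragraph above describes.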

The live broadcast recommendation method provided by the embodiment of the disclosure includes the steps of firstly intercepting a live broadcast video to obtain a current video clip, then carrying out content identification on the current video clip to obtain a current tag set related to live broadcast content, then updating a historical tag set of the live broadcast video based on the current tag set to obtain a live broadcast content tag set, and finally recommending the live broadcast video according to the live broadcast content tag set. The live broadcast video tags are updated in time through the latest live broadcast content, so that the timeliness and the accuracy of the live broadcast tags are obviously improved, and the accuracy of live broadcast recommendation is improved.

With further continued reference to fig. 3, a flow 300 of another embodiment of a live recommendation method according to the present disclosure is shown. The live broadcast recommendation method comprises the following steps:

Step 301, intercepting the live video to obtain the current video clip.

Step 302, performing content identification on the current video clip to obtain a current tag set.

In this embodiment, the specific operations of steps 301 to 302 have been described in detail in steps 201 to 202 in the embodiment shown in fig. 2, and are not described herein again.

Step 303, acquiring a historical tag set of the live video.

In this embodiment, after obtaining the current tag set, the executing entity may first obtain a historical tag set of the live video. The historical tag set may be a tag set corresponding to all live video before the current video clip in the live broadcast process. The historical tag set may be identified by the category identification model, or may be determined based on information such as the type of the live broadcast room or the past live broadcast history of the anchor.

Step 304, merging and deduplicating all tags in the current tag set and the historical tag set to obtain a live content tag set.

In this embodiment, after obtaining the historical tag set of the live video, the executing entity may directly merge all tags in the current tag set and the historical tag set, and deduplicate the merged tags, thereby obtaining the live content tag set. The deduplication operation can be implemented by calculating the similarity between tags, so that only one tag from each group of similar tags is retained in the live content tag set.
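A minimal sketch of similarity-based merging and deduplication follows. It assumes string similarity from the Python standard library (`difflib.SequenceMatcher`) and a hypothetical threshold of 0.8; the disclosure does not specify which similarity measure or threshold is used, so both are illustrative choices.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two tags as duplicates when their character-level ratio is high."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_and_dedup(
    historical: Iterable[str], current: Iterable[str], threshold: float = 0.8
) -> List[str]:
    """Merge both tag sets, keeping only the first tag of each similar group."""
    kept: List[str] = []
    for tag in list(historical) + list(current):
        if not any(is_similar(tag, other, threshold) for other in kept):
            kept.append(tag)
    return kept
```

Historical tags are visited first, so when a current tag is similar to a historical one, the historical wording is the one retained. A semantic similarity model could be substituted for the character-level ratio without changing the structure.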

Step 305, recommending the live video according to the live content tag set.

In this embodiment, the specific operation of step 305 has been described in detail in step 204 in the embodiment shown in fig. 2, and is not described herein again.

As can be seen from fig. 3, compared with the embodiment corresponding to fig. 2, in the live broadcast recommendation method in this embodiment, after the current tag set is obtained, the historical tag set is updated in a manner of merging and deduplicating the two tag sets, so that the obtained live broadcast content tag set is comprehensive and accurate, and the accuracy of live broadcast recommendation is further improved.

With further continued reference to fig. 4, a flow 400 of yet another embodiment of a live recommendation method according to the present disclosure is shown. The live broadcast recommendation method comprises the following steps:

Step 401, intercepting the live video to obtain a current video clip.

Step 402, performing content identification on the current video clip to obtain a current tag set.

Step 403, acquiring a historical tag set of the live video.

In this embodiment, the specific operations of steps 401 to 403 have been described in detail in steps 301 to 303 in the embodiment shown in fig. 3, and are not described herein again.

Step 404, acquiring a weight coefficient of each tag in the historical tag set.

In this embodiment, after obtaining the historical tag set of the live video, the execution subject may further obtain a weight coefficient of each tag in the historical tag set. For example, when a video is tagged, all tags are arranged in a certain order; tags nearer the front have a higher correlation with the video content and a larger weight coefficient. That is, the weight coefficient of a tag may be determined by its position in the arrangement order.

In some optional implementations of this embodiment, step 404 may specifically include: acquiring the accumulated deduplication count of each tag in the historical tag set; and determining the weight coefficient based on the accumulated deduplication count. Specifically, the historical tag set may be obtained by identifying a plurality of historical video clips, or may be obtained according to related information such as features of the anchor of the live video and features of users watching the live video. In the process of acquiring the historical tag set, all tags obtained over multiple rounds or from multiple sources can be merged and deduplicated, and the accumulated deduplication count of each tag is recorded. The higher the deduplication count of a tag, the more often the tag has appeared and the higher its correlation with the live video. Thus, a reasonable weight coefficient may be determined based on the accumulated deduplication count; for example, the weight coefficient may be proportional to the accumulated deduplication count.
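As a rough sketch of this optional implementation, the code below counts how many times each tag was removed as a duplicate across several batches of tags and derives a weight coefficient proportional to that count. It assumes exact string matching for deduplication and normalizes the weights to sum to 1; both choices, and the function names, are illustrative rather than taken from the disclosure.

```python
from collections import Counter
from typing import Dict, Iterable, List


def cumulative_dedup_counts(tag_batches: Iterable[List[str]]) -> Dict[str, int]:
    """Count, per tag, how many times it was dropped as a duplicate.

    A tag seen in n batches is kept once and deduplicated n - 1 times.
    """
    occurrences = Counter(tag for batch in tag_batches for tag in set(batch))
    return {tag: n - 1 for tag, n in occurrences.items()}


def weights_from_counts(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight coefficients proportional to the accumulated deduplication count."""
    total = sum(counts.values())
    if total == 0:
        return {tag: 0.0 for tag in counts}
    return {tag: count / total for tag, count in counts.items()}
```

With strict proportionality, a tag that appeared only once receives a weight of zero; an implementation might add a small constant to every count if such tags should still be eligible during screening.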

Step 405, merging and deduplicating all tags in the current tag set and the historical tag set to obtain a candidate tag set, and recording the deduplication count of each tag in the candidate tag set.

In this embodiment, after obtaining the historical tag set of the live video, the executing entity may further merge all tags in the current tag set and the historical tag set, and deduplicate the merged tags to obtain a candidate tag set. Meanwhile, the deduplication count of each tag in the candidate tag set can be recorded. Since only two tag sets are merged in this step, namely the current tag set and the historical tag set, the deduplication count of a tag in this merging process is either 0 or 1.

Step 406, screening a live content tag set from the candidate tag set based on the weight coefficient and the deduplication count.

In this embodiment, after obtaining the candidate tag set, the execution subject may further filter the tags. Specifically, the tags in the candidate tag set come from two sources, the historical tag set and the current tag set. Tags from the historical tag set have two parameters, a weight coefficient and a deduplication count; tags that come only from the current tag set have a single parameter, the deduplication count. In tag screening, the weight coefficient and the deduplication count may be considered together; for example, the weight coefficient may be corrected by the deduplication count. For a tag that has a weight coefficient, if the deduplication count is 1, the weight coefficient is increased according to a predefined weight rule, and if the deduplication count is 0, the weight coefficient is kept unchanged. For a tag that does not have a weight coefficient, a weight coefficient corresponding to its deduplication count may be set according to the same weight rule. All tags are then sorted by corrected weight coefficient, and a certain number of the top-ranked tags are taken as the live content tag set.
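The screening rule can be sketched as follows. The concrete weight rule is not given in the disclosure, so the constants `BOOST` and `DEFAULT_WEIGHT` are hypothetical, as is the use of exact string matching to decide whether a tag appears in both sets.

```python
from typing import Dict, List, Set

BOOST = 0.5           # hypothetical increase for a tag found in both sets
DEFAULT_WEIGHT = 0.1  # hypothetical weight for a tag found only in the current set


def screen_tags(
    historical_weights: Dict[str, float], current_tags: Set[str], top_n: int
) -> List[str]:
    """Correct weights by deduplication count (0 or 1) and keep the top_n tags."""
    corrected: Dict[str, float] = {}
    for tag, weight in historical_weights.items():
        dedup_count = 1 if tag in current_tags else 0
        corrected[tag] = weight + BOOST * dedup_count
    for tag in current_tags:
        if tag not in historical_weights:
            corrected[tag] = DEFAULT_WEIGHT
    # Sort by corrected weight, highest first; ties are broken alphabetically.
    ranked = sorted(corrected.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]
```

For instance, with historical weights `{"math": 0.6, "kids": 0.2, "games": 0.05}` and current tags `{"math", "addition"}`, the tag "math" is boosted, "addition" receives the default weight, and "games" falls out of a top-three selection.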

Step 407, recommending the live video according to the live content tag set.

In this embodiment, the specific operation of step 407 has been described in detail in step 204 in the embodiment shown in fig. 2, and is not described herein again.

As can be seen from fig. 4, compared with the embodiment corresponding to fig. 3, when merging and deduplicating the tags in the current tag set and the historical tag set, the live broadcast recommendation method in this embodiment first obtains the weight coefficient of each tag in the historical tag set, then merges and deduplicates all tags in the current tag set and the historical tag set to obtain a candidate tag set while recording the deduplication count of each tag in the candidate tag set, and finally screens a live content tag set out of the candidate tag set based on the weight coefficient and the deduplication count. Therefore, tags with low relevance to the live content can be filtered out, the amount of computation during recommendation is reduced, and the live broadcast recommendation efficiency is further improved.

With further continuing reference to fig. 5, a flow 500 of yet another embodiment of a live recommendation method in accordance with the present disclosure is illustrated. The live broadcast recommendation method comprises the following steps:

Step 501, intercepting the live video according to a first time interval to obtain a current video clip.

In this embodiment, the live broadcast process may be continuously performed in real time, and after detecting that a live broadcast starts, the execution main body may continuously acquire a live broadcast video stream, and intercept the obtained video stream according to a preset first time interval. The duration of each intercepted video segment may be equal. Alternatively, the duration of the video segment may be equal to the first time interval. And in the process of intercepting the live video, the obtained latest video clip is the current video clip.
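The interval-based interception can be pictured with a small sketch that only computes clip boundaries, leaving the actual video decoding aside. The function name `clip_windows` and its parameters are hypothetical; offsets are in seconds from the start of the live broadcast.

```python
from typing import List, Optional, Tuple


def clip_windows(
    elapsed_s: float, interval_s: float, clip_len_s: Optional[float] = None
) -> List[Tuple[float, float]]:
    """Return (start, end) offsets of every complete clip captured so far.

    Clips start every interval_s seconds. When clip_len_s is omitted, the clip
    duration equals the interval, so consecutive clips tile the stream.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if clip_len_s is None:
        clip_len_s = interval_s
    windows: List[Tuple[float, float]] = []
    k = 0
    while k * interval_s + clip_len_s <= elapsed_s:
        start = k * interval_s
        windows.append((start, start + clip_len_s))
        k += 1
    return windows
```

The last element of the returned list corresponds to the current video clip, and the element before it, if any, to the previous video clip.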

Step 502, identifying the content of the current video clip to obtain a current label set.

In this embodiment, the specific operation of step 502 is described in detail in step 202 in the embodiment shown in fig. 2, and is not described herein again.

Step 503, acquiring the video clip preceding the current video clip based on the interception order.

In this embodiment, after obtaining the current tag set, the executing entity needs to obtain a video clip that is previous to the current video clip. Since the capturing of the live video is continuously performed according to the first time interval in step 501, all the obtained video segments may be arranged according to the capturing sequence, so that the video segment located in front of the current video segment is the previous video segment.

In this step, if the previous video segment is acquired, the following step 504 is performed, otherwise, the following step 505 is performed.

Step 504, in response to successfully acquiring the previous video clip, taking the live content tag set of the previous video clip as the historical tag set.

In this embodiment, if the execution subject successfully acquires the previous video clip, the current video clip is not the first captured clip, and one or more historical video clips exist before it. In this case, the live content tag set of the previous video clip may be acquired and used as the historical tag set.

Step 505, in response to failing to acquire the previous video clip, taking the basic tag set of the live video as the historical tag set.

In this embodiment, if the executing entity fails to acquire the previous video segment, which indicates that the current video segment is the first captured segment, and there is no history video segment before the current video segment, at this time, the basic tag set of the live video may be acquired as the history tag set.

In some optional implementations of this embodiment, the basic tag set may be obtained by: acquiring text description information and a cover picture of the live video; extracting keywords from the text description information to obtain text tags; performing character recognition on the cover picture to obtain cover tags; and merging the text tags and the cover tags to obtain the basic tag set.

Specifically, the execution subject may first obtain the text description information and the cover picture of the live video, where the text description information may include text such as the live broadcast title and an introduction to the live content, and the cover picture may be the image displayed for the live broadcast room on the live broadcast platform. The execution subject then performs semantic analysis on the text description information and uses the resulting keywords as text tags; meanwhile, character recognition can be performed on the cover picture through OCR (Optical Character Recognition) technology to obtain cover tags. Finally, the execution subject can merge and deduplicate the text tags and the cover tags to obtain the basic tag set. For the specific merging and deduplication method, reference may be made to the above description of merging and deduplicating the current tag set and the historical tag set, which is not repeated here. By identifying the text description information and the cover picture, an accurate basic tag set can be obtained, which provides an accurate reference for subsequently generating the live content tag set.
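The construction of the basic tag set can be sketched under two simplifying assumptions: the cover picture has already been passed through an OCR engine, so the sketch receives its recognized text as the string `cover_text`, and the semantic analysis is replaced by a plain stop-word filter. Both stand-ins, the stop-word list, and the function names are hypothetical.

```python
import re
from typing import List

# Hypothetical stop-word list standing in for real semantic analysis.
STOPWORDS = {"a", "an", "the", "for", "and", "of", "to", "in", "with"}


def extract_keywords(text: str) -> List[str]:
    """Lower-case the text and keep each non-stop-word token once, in order."""
    keywords: List[str] = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token not in STOPWORDS and token not in keywords:
            keywords.append(token)
    return keywords


def basic_tag_set(description: str, cover_text: str) -> List[str]:
    """Merge text tags and cover tags, dropping exact duplicates."""
    text_tags = extract_keywords(description)
    cover_tags = extract_keywords(cover_text)
    return text_tags + [tag for tag in cover_tags if tag not in text_tags]
```

In this sketch, `description` would hold the live broadcast title together with the content introduction.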

Step 506, merging and deduplicating all tags in the current tag set and the historical tag set to obtain a live content tag set.

Step 507, recommending the live video according to the live content tag set.

In this embodiment, the specific operations of steps 506-507 have been described in detail in steps 304-305 in the embodiment shown in fig. 3, and are not described herein again.

It should be noted that each video clip captured in this embodiment may correspond to a live content tag set. For the first captured video clip, all tags in the clip's current tag set and the basic tag set are merged and deduplicated to obtain the corresponding live content tag set. For every subsequent video clip, the clip's current tag set is merged and deduplicated with the live content tag set of the previous video clip to obtain the corresponding live content tag set.
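The per-clip chaining described here can be sketched as a simple loop. The sketch assumes exact-match deduplication (a plain set union) in place of similarity-based deduplication, and the function name `live_tag_sets_per_clip` is hypothetical.

```python
from typing import List, Optional, Set


def live_tag_sets_per_clip(
    basic_tags: Set[str], current_tag_sets: List[Set[str]]
) -> List[Set[str]]:
    """Compute one live content tag set per clip, in interception order."""
    results: List[Set[str]] = []
    previous: Optional[Set[str]] = None
    for current in current_tag_sets:
        # First clip: no previous clip exists, so fall back to the basic tag set.
        historical = previous if previous is not None else basic_tags
        live_tags = historical | current  # merge and deduplicate
        results.append(live_tags)
        previous = live_tags
    return results
```

Because each clip inherits the tag set of the clip before it, the basic tag set propagates through the whole live broadcast.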

As can be seen from fig. 5, compared with the embodiment corresponding to fig. 3, the live broadcast recommendation method in this embodiment first intercepts the live video according to a first time interval and, after obtaining the current tag set, obtains the video clip preceding the current video clip based on the interception order. If the previous video clip is successfully acquired, the live content tag set of the previous video clip is taken as the historical tag set; otherwise, the basic tag set of the live video is taken as the historical tag set. Merging, deduplication, and recommendation are then carried out. By introducing the basic tag set, the accuracy of the live content tag set can be further improved, and the recommendation efficiency is improved.

With further continued reference to fig. 6, a flow 600 of yet another embodiment of a live recommendation method according to the present disclosure is shown. The live broadcast recommendation method comprises the following steps:

Step 601, intercepting the live video to obtain a current video clip.

In this embodiment, the specific operation of step 601 has been described in detail in step 201 in the embodiment shown in fig. 2, and is not described herein again.

Step 602, performing content identification on a key frame included in a current video clip to obtain a first tag set.

In this embodiment, after obtaining the current video clip, the executing entity may first extract a key frame in the video clip, then perform content recognition on the key frame through an OCR technology, extract all content tags on each key frame, and form a first tag set. The key frames may be obtained by a motion analysis or video clustering method, which belongs to the known technology in the art and will not be described herein again.

Step 603, performing content identification on the voice information included in the current video clip to obtain a second tag set.

In this embodiment, after obtaining the current video segment, the execution main body may extract the voice information of the current video segment, then extract the key content in the voice information through Automatic Speech Recognition (ASR) and a sequence tagging algorithm, and use the obtained keyword as a content tag to form a second tag set.

Step 604, merging the first tag set and the second tag set to obtain a current tag set.

In this embodiment, after obtaining the first tag set and the second tag set, the executing entity may directly merge all tags in the two tag sets, and perform a deduplication operation on all merged tags, so as to finally obtain the current tag set. The deduplication operation may be implemented by calculating similarity between tags, so that only one similar tag is retained in the current tag set.

Step 605, updating the historical tag set of the live video based on the current tag set to obtain a tag set of the live content.

In this embodiment, the specific operation of step 605 has been described in detail in step 203 in the embodiment shown in fig. 2, and is not described herein again.

Step 606, acquiring characteristic information of the target object.

In this embodiment, the target object may be a registered user on the live platform, and the feature information may include information such as video preference of the user. The execution subject may determine feature information of the target object according to the registration information and the historical behavior information of the target object.

In the technical solution of the present disclosure, the acquisition, storage, and application of users' personal information all comply with relevant laws and regulations, and do not violate public order and good morals.

Step 607, determining an anchor feature tag set based on the live broadcast history of the anchor of the live video.

In this embodiment, the execution subject may obtain an anchor feature tag set related to the anchor, where the tag set is derived from the anchor's live broadcast history. For example, the anchor feature tag set may be determined from the title and profile of each previous live broadcast, or the live content tag set obtained by the above steps may be stored as a subset of the anchor feature tag set.

Step 608, correcting the live content tag set by means of the anchor feature tag set.

In this embodiment, after obtaining the anchor feature tag set, the executing entity may further correct the live content tag set of the live video by using the anchor feature tag set. Since the anchor feature tag set covers a large amount of historical live information, its accuracy is usually higher than the tags obtained in a live broadcast. If the tag contents in the anchor characteristic tag set and the live content tag set conflict, the anchor characteristic tag can be preferentially selected to replace the live content tag to serve as the corrected live content tag.
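The disclosure does not define what counts as a conflict between tags. The sketch below assumes, purely for illustration, that every tag is stored under a category key (for example "subject") and that two tags conflict when they share a category but differ in value. The function name `correct_live_tags` is hypothetical.

```python
from typing import Dict


def correct_live_tags(
    live_tags: Dict[str, str], anchor_tags: Dict[str, str]
) -> Dict[str, str]:
    """Replace conflicting live content tags with the anchor feature tags.

    Categories present only in the anchor tag set are not added, and
    categories present only in the live tag set are kept unchanged.
    """
    corrected = dict(live_tags)
    for category, anchor_value in anchor_tags.items():
        if category in corrected and corrected[category] != anchor_value:
            corrected[category] = anchor_value
    return corrected
```

For example, if recognition of the current clip yields the subject "physics" while the anchor's history consistently indicates "math", the corrected live content tag set carries "math".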

Step 609, recommending the live video to the target object in response to the corrected live content tag set matching the characteristic information.

In this embodiment, after obtaining the corrected live content tag set, the execution subject may calculate a similarity between the corrected live content tag set and the feature information. When the similarity is greater than a preset threshold, the live content tag set may be considered to match the feature information, and the live video may then be recommended to the target object corresponding to the feature information.
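The similarity measure and the threshold are not specified in the disclosure. As one possible illustration, the sketch below assumes that the feature information is itself a set of interest tags and uses Jaccard similarity; the function names and the default threshold of 0.3 are hypothetical.

```python
def jaccard_similarity(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0  # treat two empty sets as having nothing in common
    return len(a & b) / len(a | b)


def should_recommend(
    live_tags: set[str],
    interest_tags: set[str],
    threshold: float = 0.3,  # assumed preset threshold
) -> bool:
    """Recommend only when the similarity strictly exceeds the threshold."""
    return jaccard_similarity(live_tags, interest_tags) > threshold
```

For example, the tag sets {cooking, pasta, italian} and {cooking, pasta, baking} share two of four distinct tags, giving a similarity of 0.5, which exceeds the assumed threshold.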

As can be seen from fig. 6, compared with the embodiment corresponding to fig. 2, the live broadcast recommendation method in this embodiment obtains the current tag set by performing content identification on the key frames and voice information of the current video segment, and corrects the live content tag set by using an anchor feature tag set. This can further improve the accuracy of the live content tag set and thus the recommendation accuracy.

With further reference to fig. 7, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides an embodiment of a live broadcast recommendation apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 7, the live recommendation apparatus 700 of this embodiment may include an intercepting module 701, an identifying module 702, an updating module 703 and a recommending module 704. The intercepting module 701 is configured to intercept a live video to obtain a current video segment; the identifying module 702 is configured to perform content identification on the current video segment to obtain a current tag set; the updating module 703 is configured to update a historical tag set of the live video based on the current tag set to obtain a live content tag set; and the recommending module 704 is configured to recommend the live video according to the live content tag set.
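The interplay of the four modules can be sketched as a simple loop. This is only an illustrative outline: the function name, the representation of clips as strings, the injected `identify` callable, and the plain set union used for the update are all assumptions, not the implementation of the apparatus.

```python
from typing import Callable, Iterable


def run_live_recommendation(
    clips: Iterable[str],
    identify: Callable[[str], set[str]],
    base_tags: set[str],
) -> list[set[str]]:
    """Return the live content tag set produced after each intercepted clip."""
    history = set(base_tags)
    snapshots: list[set[str]] = []
    for clip in clips:                   # clips come from the intercepting module 701
        current = identify(clip)         # identifying module 702
        history = history | current      # updating module 703: merge and deduplicate
        snapshots.append(set(history))   # tag set handed to the recommending module 704
    return snapshots
```

Each snapshot is the tag set that a recommending step would consume at that point in the broadcast.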

In this embodiment, for the specific processing of the intercepting module 701, the identifying module 702, the updating module 703 and the recommending module 704 of the live recommendation apparatus 700, and for the technical effects thereof, reference may be made to the related descriptions of steps 201 to 204 in the embodiment corresponding to fig. 2, which are not repeated here.

In some optional implementations of this embodiment, the updating module 703 includes: the tag acquisition sub-module is configured to acquire a historical tag set of the live video; and the first merging submodule is configured to merge and deduplicate all the tags in the current tag set and the historical tag set to obtain a live content tag set.

In some optional implementations of this embodiment, the first merging submodule includes: a weight obtaining unit configured to obtain a weight coefficient of each tag in the historical tag set; a tag merging unit configured to merge and deduplicate all tags in the current tag set and the historical tag set to obtain a candidate tag set, and to record the number of deduplications of each tag in the candidate tag set; and a tag screening unit configured to screen the live content tag set from the candidate tag set based on the weight coefficients and the numbers of deduplications.

In some optional implementations of this embodiment, the weight obtaining unit includes: a times obtaining subunit configured to obtain the accumulated number of deduplications of each tag in the historical tag set; and a weight determination subunit configured to determine the weight coefficient based on the accumulated number of deduplications.
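The weighting and screening described above might look like the following sketch. The disclosure does not give a formula, so the linear weight, the additive score, the `top_k` cut-off, and all function names are assumptions chosen only to make the idea concrete.

```python
def weight_from_count(accumulated: int) -> float:
    # Assumption: the weight grows linearly with how often a tag has recurred.
    return 1.0 + accumulated


def screen_tags(
    current: set[str],
    history_counts: dict[str, int],
    top_k: int,
) -> dict[str, int]:
    """Merge current and historical tags, then keep the top_k by score.

    history_counts maps each historical tag to its accumulated number of
    deduplications. The result maps each kept tag to its updated count.
    """
    candidates = set(current) | set(history_counts)  # merge and deduplicate
    # A tag is deduplicated in this round if it appears in both sets.
    dedup_now = {t: int(t in current and t in history_counts) for t in candidates}
    scores = {
        t: weight_from_count(history_counts.get(t, 0)) + dedup_now[t]
        for t in candidates
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(candidates, key=lambda t: (-scores[t], t))
    return {t: history_counts.get(t, 0) + dedup_now[t] for t in ranked[:top_k]}
```

With this scoring, a tag that keeps reappearing across clips accumulates a larger count and is less likely to be screened out than a tag seen only once.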

In some optional implementations of this embodiment, the intercepting module 701 includes: an intercepting submodule configured to intercept the live video according to a first time interval. The tag acquisition submodule includes: a segment acquisition unit configured to acquire the video segment preceding the current video segment based on the time order of interception; a first determining unit configured to take the live content tag set of the previous video segment as the historical tag set in response to the previous video segment being successfully acquired; and a second determining unit configured to take the base tag set of the live video as the historical tag set in response to a failure to acquire the previous video segment.
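A minimal sketch of this fallback logic follows, assuming that clips are indexed by their interception order and that previously computed tag sets are kept in a dictionary keyed by that index. Both function names and the storage layout are hypothetical.

```python
def clip_start_times(elapsed_seconds: int, interval_seconds: int) -> list[int]:
    """Start offsets of the clips intercepted at a fixed time interval."""
    return list(range(0, elapsed_seconds, interval_seconds))


def select_history_tags(
    clip_index: int,
    tag_sets_by_clip: dict[int, set[str]],
    base_tags: set[str],
) -> set[str]:
    """Use the previous clip's live content tag set if it exists,
    otherwise fall back to the base tag set of the live video."""
    previous = tag_sets_by_clip.get(clip_index - 1)
    if previous is not None:
        return previous
    return base_tags
```

For the first clip of a broadcast there is no previous clip, so the base tag set serves as the historical tag set.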

In some optional implementations of this embodiment, the base tag set is obtained by the following subunits: an information acquisition subunit configured to acquire text description information and a cover image of the live video; a text extraction subunit configured to extract keywords from the text description information to obtain a text tag; an image identification subunit configured to perform character recognition on the cover image to obtain a cover tag; and a tag merging subunit configured to merge the text tag and the cover tag to obtain the base tag set.
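The construction of the base tag set can be sketched with the keyword extractor and the character recognizer passed in as callables, so that no particular OCR or NLP library is implied. The function name, the parameter names, and the choice to run keyword extraction on the recognized cover text are assumptions.

```python
from typing import Callable


def build_base_tags(
    description: str,
    cover_image: bytes,
    extract_keywords: Callable[[str], set[str]],
    recognize_text: Callable[[bytes], str],
) -> set[str]:
    """Merge tags from the text description with tags read off the cover image."""
    text_tags = extract_keywords(description)
    # Assumption: the recognized cover text is reduced to tags with the same extractor.
    cover_tags = extract_keywords(recognize_text(cover_image))
    return text_tags | cover_tags
```

In a test, `recognize_text` can be replaced by a stub that simply decodes bytes, which keeps the merging logic verifiable without an OCR engine.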

In some optional implementations of this embodiment, the identifying module 702 includes: a first identification submodule configured to perform content identification on key frames included in the current video segment to obtain a first tag set; a second identification submodule configured to perform content identification on voice information included in the current video segment to obtain a second tag set; and a second merging submodule configured to merge the first tag set and the second tag set to obtain the current tag set.

In some optional implementations of this embodiment, the recommending module 704 includes: a feature acquisition submodule configured to acquire feature information of the target object; a feature determination submodule configured to determine an anchor feature tag set based on the live history of the anchor of the live video; a tag correction submodule configured to correct the live content tag set by means of the anchor feature tag set; and a recommending submodule configured to recommend the live video to the target object in response to the corrected live content tag set matching the feature information.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not intended to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data necessary for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806 such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.

The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the respective methods and processes described above, such as the live recommendation method. For example, in some embodiments, the live recommendation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the live recommendation method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the live recommendation method in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a server of a distributed system or a server incorporating a blockchain. The server can also be a cloud server, or an intelligent cloud computing server or an intelligent cloud host with artificial intelligence technology.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

1. A live recommendation method, the method comprising:

intercepting a live video to obtain a current video clip;

performing content identification on the current video clip to obtain a current tag set;

updating a historical tag set of the live video based on the current tag set to obtain a live content tag set, wherein the historical tag set is a tag set corresponding to all live videos before the current video clip in the current live broadcasting process;

recommending the live broadcast video according to the live broadcast content tag set;

wherein updating the historical tag set of the live video based on the current tag set to obtain a live content tag set comprises:

correcting the labels in the historical label set by taking the labels in the current label set as a reference standard to obtain the live broadcast content label set;

wherein recommending the live video according to the live content tag set comprises:

acquiring characteristic information of a target object, wherein the characteristic information of the target object comprises a target object interest tag, and the target object interest tag comprises a tag of a live video watched by the target object in the past;

determining an anchor characteristic tag set based on the live history of the anchor of the live video;

correcting the live content tag set through the anchor characteristic tag set;

recommending the live video to the target object in response to the corrected live content tag set matching the feature information;

wherein the recommending the live video to the target object in response to the corrected live content tag set matching the feature information comprises:

recommending the live video to the target object in response to the corrected live content tag set matching the target object interest tag;

wherein updating the historical tag set of the live video based on the current tag set to obtain a tag set of live content further comprises:

acquiring a historical label set of the live video;

and merging and de-duplicating all the labels in the current label set and the historical label set to obtain the live broadcast content label set.

2. The method of claim 1, wherein the merging and de-duplicating all tags in the current tag set and the historical tag set to obtain the live content tag set comprises:

acquiring a weight coefficient of each label in the historical label set;

combining and de-duplicating all the labels in the current label set and the historical label set to obtain a candidate label set, and recording the de-duplication times of all the labels in the candidate label set;

and screening the live broadcast content label set from the candidate label set based on the weight coefficient and the de-duplication times.

3. The method of claim 2, wherein obtaining a weight coefficient for each tag in the historical set of tags comprises:

acquiring the accumulated duplication removal times of all the labels in the historical label set;

determining the weight coefficient based on the accumulated deduplication number.

4. The method of claim 1, wherein,

the intercepting the live video comprises: intercepting the live video according to a first time interval; and

the acquiring the historical tag set of the live video comprises:

acquiring a previous video clip of the current video clip based on the intercepted time sequence;

in response to the previous video clip being successfully acquired, taking a live content tag set of the previous video clip as the historical tag set;

and in response to the failure of the acquisition of the previous video clip, taking the basic tag set of the live video as the historical tag set.

5. The method of claim 4, wherein the base set of tags is obtained by:

acquiring text description information and a cover picture of the live video;

extracting keywords from the text description information to obtain a text label;

performing character recognition on the cover picture to obtain a cover label;

and combining the text label and the cover label to obtain the basic label set.

6. The method of any of claims 1-5, wherein the content identifying the current video segment to obtain a current tag set comprises:

performing content identification on key frames included in the current video clip to obtain a first label set;

performing content identification on voice information included in the current video clip to obtain a second tag set;

and merging the first label set and the second label set to obtain the current label set.

7. A live recommendation apparatus, the apparatus comprising:

the intercepting module is configured to intercept the live video to obtain a current video segment;

the identification module is configured to identify the content of the current video clip to obtain a current tag set;

the updating module is configured to update a historical tag set of the live video based on the current tag set to obtain a tag set of live content, wherein the historical tag set is a tag set corresponding to all live videos before the current video clip in the current live broadcasting process;

a recommendation module configured to recommend the live video according to the live content tag set;

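wherein the update module is configured to:

correct the labels in the historical label set by taking the labels in the current label set as a reference standard to obtain the live broadcast content label set;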

wherein the recommendation module comprises:

the characteristic obtaining sub-module is configured to obtain characteristic information of a target object, wherein the characteristic information of the target object comprises a target object interest tag, and the target object interest tag comprises a tag of a live video which is watched by the target object in the past;

a feature determination submodule configured to determine an anchor feature tag set based on a live history of an anchor of the live video;

a tag correction sub-module configured to correct the live content tagset by the anchor feature tagset;

a recommendation sub-module configured to recommend the live video to the target object in response to the corrected live content tag set matching the feature information;

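wherein the recommendation sub-module is further configured to:

recommend the live video to the target object in response to the corrected live content tag set matching the target object interest tag;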

wherein the update module comprises:

the tag acquisition sub-module is configured to acquire a historical tag set of the live video;

and the first merging submodule is configured to merge and deduplicate all tags in the current tag set and the historical tag set to obtain the live content tag set.

8. The apparatus of claim 7, wherein the first merge sub-module comprises:

a weight obtaining unit configured to obtain a weight coefficient of each label in the history label set;

the label merging unit is configured to merge and deduplicate all labels in the current label set and the historical label set to obtain a candidate label set, and record the deduplicating times of all the labels in the candidate label set;

and the label screening unit is configured to screen the live content label set from the candidate label set based on the weight coefficient and the deduplicating times.

9. The apparatus of claim 8, wherein the weight obtaining unit comprises:

the times obtaining subunit is configured to obtain the accumulated duplication removing times of the labels in the historical label set;

a weight determination subunit configured to determine the weight coefficient based on the accumulated number of deduplication times.

10. The apparatus of claim 7, wherein the intercepting module comprises:

the intercepting submodule is configured to intercept the live video according to a first time interval; and

the tag acquisition sub-module includes:

a section acquiring unit configured to acquire a video section preceding the current video section based on the intercepted time sequence;

a first determining unit configured to take a live content tag set of the previous video segment as the history tag set in response to the previous video segment being successfully acquired;

a second determining unit configured to take a base tag set of the live video as the history tag set in response to the previous video segment acquisition failure.

11. The apparatus of claim 10, wherein the base set of tags is obtained by the following subunits:

an information acquisition subunit configured to acquire text description information and a cover page image of the live video;

the text extraction subunit is configured to extract keywords from the text description information to obtain a text label;

the image identification subunit is configured to perform character identification on the cover image to obtain a cover label;

and the label combining subunit is configured to combine the text label and the cover label to obtain the basic label set.

12. The apparatus of any one of claims 7-11, wherein the identification module comprises:

the first identification submodule is configured to perform content identification on key frames included in the current video clip to obtain a first label set;

the second identification submodule is configured to perform content identification on voice information included in the current video clip to obtain a second tag set;

and the second merging submodule is configured to merge the first label set and the second label set to obtain the current label set.

13. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.

14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.