CN111083503A - Method, device, equipment and storage medium for calculating similarity of live broadcast rooms - Google Patents


Publication number
CN111083503A
Authority
CN
China
Prior art keywords
user
live
live broadcast
broadcast room
behavior data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811229391.9A
Other languages
Chinese (zh)
Other versions
CN111083503B (en)
Inventor
王璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Douyu Network Technology Co Ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd filed Critical Wuhan Douyu Network Technology Co Ltd
Priority to CN201811229391.9A priority Critical patent/CN111083503B/en
Publication of CN111083503A publication Critical patent/CN111083503A/en
Application granted granted Critical
Publication of CN111083503B publication Critical patent/CN111083503B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/252 Processing of multiple end-users' preferences to derive collaborative data
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26291 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for providing content or additional data updates, e.g. updating software modules, stored at the client
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667 Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • H04N21/4668 Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a storage medium for calculating the similarity between live broadcast rooms. The method comprises the following steps: while a live video stream is being generated in a live broadcast room, generating a user behavior data stream according to the user behavior of users watching the live video stream in that live broadcast room; updating the specified live broadcast room indexes according to the user behavior data stream; and calculating the similarity between live broadcast rooms according to the updated live broadcast room indexes. Because the method calculates the similarity between live broadcast rooms from real-time data, the similarity is more timely and an accurate similarity between live broadcast rooms can be obtained.

Description

Method, device, equipment and storage medium for calculating similarity of live broadcast rooms
Technical Field
The embodiment of the invention relates to the technical field of big data processing, and in particular to a method, a device, equipment and a storage medium for calculating the similarity between live broadcast rooms.
Background
In the field of big data applications, personalized recommendations can be made to a user on the basis of mass data. For example, a network live broadcast platform recommends to the user live broadcast rooms similar to the one the user has watched. The more accurate the similarity between live broadcast rooms, the better the recommended live broadcast rooms match the user's personal interests.
Currently, the similarity between live broadcast rooms is calculated mainly from offline data. Offline data is data that is collected in real time but stored in a database only at a certain interval (usually a certain number of days); when the similarity is calculated, the related data is read from the database. In practical application, this storage delay causes two problems. On one hand, user behavior data from a live broadcast room that first appears on the current day cannot be stored in the database in time, so the data for the new live broadcast room is missing or sparse and its similarity to other live broadcast rooms cannot be measured. On the other hand, a linkage event between live broadcast rooms can greatly increase the similarity between them, but the user behavior data generated by the linkage event cannot be stored in the database in time, which reduces the accuracy of the similarity.
Therefore, the existing method of calculating the similarity between live broadcast rooms from offline data suffers from poor real-time performance and, as a result, inaccurate similarity between live broadcast rooms.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for calculating the similarity between live broadcast rooms. The aim is to calculate the similarity from real-time data and thereby solve the problem in the prior art that the similarity between live broadcast rooms is inaccurate because it is not updated in real time.
In a first aspect, an embodiment of the present invention provides a method for calculating the similarity between live broadcast rooms, including:
in the process of generating a live video stream in a live broadcast room, generating a user behavior data stream according to the user behavior of a user watching the live video stream in the live broadcast room;
updating the specified live broadcast room index according to the user behavior data stream;
and calculating the similarity between the live broadcast rooms according to the updated live broadcast room indexes.
In a second aspect, an embodiment of the present invention provides a device for calculating the similarity between live broadcast rooms, including:
the user behavior data stream generation module is used for generating a user behavior data stream according to the user behavior of a user watching the live video stream in a live broadcast room in the process of generating the live video stream in the live broadcast room;
the live broadcast room index updating module is used for updating the specified live broadcast room index according to the user behavior data stream;
and the similarity calculation module is used for calculating the similarity between the live broadcast rooms according to the updated live broadcast room indexes.
In a third aspect, an embodiment of the present invention provides equipment, where the equipment includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for calculating the similarity between live broadcast rooms according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for calculating the similarity between live broadcast rooms according to any embodiment of the present invention.
According to the method for calculating the similarity between live broadcast rooms, a user behavior data stream is generated according to the user behavior of users watching the live video stream in a live broadcast room while that live video stream is being generated; the specified live broadcast room indexes are updated according to the user behavior data stream; and the similarity between live broadcast rooms is calculated according to the updated indexes. This solves the problem in existing methods that the similarity between live broadcast rooms has poor real-time performance and is therefore inaccurate. Because the similarity is calculated from real-time data, it is more timely, and an accurate similarity between live broadcast rooms can be obtained.
Drawings
Fig. 1 is a flowchart of a method for calculating the similarity between live broadcast rooms according to Embodiment One of the present invention;
Fig. 2A is a flowchart of a method for calculating the similarity between live broadcast rooms according to Embodiment Two of the present invention;
Fig. 2B is a flowchart of collecting user behavior data in the method according to Embodiment Two of the present invention;
Fig. 2C is an architecture diagram of the message middleware kafka in the method according to Embodiment Two of the present invention;
Fig. 2D is a flowchart of updating the live broadcast room indexes in the method according to Embodiment Two of the present invention;
Fig. 2E is a flowchart of calculating the similarity between live broadcast rooms in the method according to Embodiment Two of the present invention;
Fig. 3 is a schematic structural diagram of a device for calculating the similarity between live broadcast rooms according to Embodiment Three of the present invention;
Fig. 4 is a schematic structural diagram of equipment according to Embodiment Four of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Embodiment One
Fig. 1 is a flowchart of a method for calculating the similarity between live broadcast rooms according to Embodiment One of the present invention. This embodiment is applicable to the case where a network live broadcast platform calculates the similarity between live broadcast rooms. The method may be executed by a device for calculating the similarity between live broadcast rooms; the device may be implemented in software and/or hardware and integrated in the equipment that executes the method. Specifically, as shown in Fig. 1, the method may include the following steps:
S101, in the process of generating a live video stream in a live broadcast room, generating a user behavior data stream according to a user behavior of a user watching the live video stream in the live broadcast room.
Specifically, the live broadcast system of the embodiment of the present invention may include a live broadcast platform and clients. The live broadcast platform may be a server providing live broadcast services and may be communicatively connected to a plurality of clients through the internet. The live broadcast platform creates a virtual live broadcast room for the anchor client. A user can enter the live broadcast room after logging in to the live broadcast platform through the anchor client or an audience client. The anchor generates a live video stream when broadcasting in the live broadcast room, and users can watch the live video stream through audience clients.
In the embodiment of the invention, the user behavior may be, for example, a user entering a live broadcast room, a user exiting a live broadcast room, or a user interacting with the anchor. The user behavior data related to the user behavior can be collected and then assembled into a user behavior data stream according to a preset format. The user behavior data stream contains the user behavior data of a plurality of users and is output in stream form.
S102, updating the specified live broadcast room index according to the user behavior data stream.
In the embodiment of the invention, a live broadcast room index database can be set up in advance. The database contains indexes such as the user set of each live broadcast room, the number of times each user has watched live video streams of live broadcast rooms, and the last watching time at which each user watched the live video stream of each live broadcast room. Each live broadcast room index may be a variable with a corresponding value, and that value can be updated in real time whenever the live broadcast room index is updated.
For example, after a user behavior data stream is received, the corresponding live broadcast room index can be updated according to the user behavior data in the stream. The live broadcast room indexes are thus updated in real time, so the similarity between live broadcast rooms calculated from them is more timely and more accurate.
S103, calculating the similarity between the live broadcast rooms according to the updated live broadcast room indexes.
Specifically, a trigger event for calculating the similarity between live broadcast rooms may be set. The trigger event may be a set time point, a set time period, or a detected specified behavior of a user. The specified behavior may be the user entering the live broadcast room, the user connecting with the anchor in the live broadcast room, or the user competing against the anchor in the live broadcast room. When the trigger event is detected, two live broadcast rooms are selected from the live broadcast rooms, and the similarity between them is calculated using the updated live broadcast room indexes.
According to the method for calculating the similarity between live broadcast rooms, a user behavior data stream is generated according to the user behavior of users watching the live video stream in a live broadcast room while that live video stream is being generated; the specified live broadcast room indexes are updated according to the user behavior data stream; and the similarity between live broadcast rooms is calculated according to the updated indexes. This solves the problem in existing methods that the similarity between live broadcast rooms has poor real-time performance and is therefore inaccurate. Because the similarity is calculated from real-time data, it is more timely, and an accurate similarity between live broadcast rooms can be obtained.
Embodiment Two
Fig. 2A is a flowchart of a method for calculating the similarity between live broadcast rooms according to Embodiment Two of the present invention. This embodiment builds on Embodiment One and optimizes the generation of the user behavior data stream and the updating of the live broadcast room indexes. Specifically, as shown in Fig. 2A, the method provided in this embodiment may include the following steps:
S201, in the process of generating a live video stream in a live broadcast room, when it is detected that a user exits the live broadcast room and finishes watching the live video stream, collecting user behavior data related to the user.
In the embodiment of the present invention, while a live video stream is being generated in a live broadcast room, a plurality of users generally watch that stream at the same time, and the user behavior data stream can be generated from the behavior data of each user. Therefore, when it is detected that a user exits the live broadcast room and finishes watching the live video stream, the user behavior data related to that user can be collected. Optionally, as shown in Fig. 2B, collecting the user behavior data related to the user may specifically include:
S2011, acquiring the live broadcast room identifier of the live broadcast room, the user identifier of the user, and the viewing end time at which the user finishes watching the live video stream;
S2012, counting the viewing duration for which the user watched the live video stream.
Specifically, each live broadcast room has a unique live broadcast room identifier (live broadcast room ID), and each user has a unique user identifier (user ID). When the user exits the live broadcast room, the live broadcast room identifier, the user identifier, and the viewing end time at which the user exits the live broadcast room and finishes watching the live video stream can be acquired. The viewing duration for which the user watched the live video stream of the live broadcast room is obtained from the viewing end time and the time at which the user entered the live broadcast room. The live broadcast room identifier, the user identifier, the viewing end time and the viewing duration together form the user behavior data.
S202, generating a user behavior data stream from the user behavior data according to a preset data stream format.
Specifically, the user behavior data stream may be generated from the live broadcast room identifier, the user identifier, the viewing end time and the viewing duration according to a preset data stream format. For example, the preset data stream format may be as follows:
{"room_id":r,"uid":u,"end_time":et,"wat_time":wt}
wherein room_id is the live broadcast room identifier, uid is the user identifier, end_time is the viewing end time, and wat_time is the viewing duration.
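As an illustration of the preset data stream format, the following minimal Python sketch builds one record and serializes it to JSON. The function name `build_record`, the sample identifiers, and the use of epoch seconds for the times are assumptions made for this example; the description above fixes only the four field names.

```python
import json


def build_record(room_id, uid, enter_ts, end_ts):
    """Build one user behavior record when a user leaves a live broadcast room.

    enter_ts and end_ts are assumed to be epoch seconds (an assumption of
    this sketch); the viewing duration is their difference.
    """
    return {
        "room_id": room_id,             # live broadcast room identifier
        "uid": uid,                     # user identifier
        "end_time": end_ts,             # viewing end time
        "wat_time": end_ts - enter_ts,  # viewing duration in seconds
    }


# Hypothetical sample values, for illustration only.
record = build_record("r100", "u42", enter_ts=1540209600, end_ts=1540209930)

# Compact separators reproduce the layout of the format shown above.
message = json.dumps(record, separators=(",", ":"))
print(message)
```

Serializing each record as one compact JSON object keeps every message self-describing, which is convenient when records from many users are interleaved in a single stream.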
In practical application, the streaming data can be generated through the message middleware kafka.
Fig. 2C shows the architecture of the message middleware kafka. As shown in Fig. 2C, the architecture includes a plurality of message producers, a plurality of message consumers and a kafka server, and the producers and consumers are connected through the kafka server. A message producer is a client that sends messages to the kafka server, and a message consumer is a client that reads messages from the kafka server. In the embodiment of the present invention, a message producer may be an application program embedded in a live broadcast client or in the live broadcast platform. The application program monitors the live broadcast room; when it detects that a user exits the live broadcast room, it collects the user behavior data related to that user and continuously sends the data to the kafka server. The kafka server performs streaming processing on the received user behavior data and caches it for the message consumers to read. For example, a message consumer may be a service-side application program that calculates the similarity between live broadcast rooms, such as one built on the storm stream processing framework. The service-side application program reads the user behavior data stream from the kafka server and updates the live broadcast room indexes according to the user behavior data stream.
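The producer and consumer roles described above can be sketched without a running kafka server. The sketch below uses Python's standard `queue.Queue` purely as an in-memory stand-in for the kafka server; it is not the kafka client API, and the names `broker`, `produce` and `consume_all` are hypothetical.

```python
import json
import queue

# In-memory stand-in for the kafka server (assumption: NOT the kafka API).
broker = queue.Queue()


def produce(record):
    """Producer side: serialize one user behavior record and send it."""
    broker.put(json.dumps(record))


def consume_all():
    """Consumer side: read and decode every message currently buffered."""
    records = []
    while not broker.empty():
        records.append(json.loads(broker.get()))
    return records


# Two users leave two rooms; the embedded application reports both exits.
produce({"room_id": "r1", "uid": "u1", "end_time": 1000, "wat_time": 45})
produce({"room_id": "r2", "uid": "u2", "end_time": 1010, "wat_time": 5})

# The service-side application reads the buffered stream in arrival order.
received = consume_all()
print(received)
```

In a real deployment the queue would be replaced by a kafka topic, which additionally persists messages and lets several consumers read the same stream independently.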
S203, extracting user behavior data from the user behavior data stream.
Specifically, the user behavior data stream is generated from the user behavior data collected when each user exits a live broadcast room and finishes watching the live video stream. After the user behavior data stream is received, the user behavior data of each user can be extracted from it.
S204, judging whether the user behavior data is valid.
In practical application, the user behavior data may be filtered to remove invalid user behavior data, for example data generated by user misoperation. Specifically, the user behavior data includes the viewing duration, and it may be determined whether the viewing duration is greater than a preset value. If yes, the user behavior data is determined to be valid and S205 is executed; if not, the user behavior data is determined to be invalid and can be discarded.
For example, the preset value may be 10 seconds. A viewing duration longer than 10 seconds indicates that the user is interested in the live broadcast room. A viewing duration of 10 seconds or less suggests that the user entered the live broadcast room by mistake or is not interested in it. Judging the validity of the user behavior data prevents the live broadcast room indexes from being updated with invalid data, which would make the similarity between live broadcast rooms inaccurate, and thereby improves the accuracy of the similarity.
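The validity check in S204 amounts to a threshold test on the viewing duration. A minimal Python sketch follows, assuming records shaped like the data stream format above; the names `MIN_WATCH_SECONDS` and `is_valid` are hypothetical.

```python
# Preset value from the example above: 10 seconds.
MIN_WATCH_SECONDS = 10


def is_valid(record, min_seconds=MIN_WATCH_SECONDS):
    """Return True only if the viewing duration exceeds the preset value."""
    return record.get("wat_time", 0) > min_seconds


# Hypothetical sample records, for illustration only.
stream = [
    {"room_id": "r1", "uid": "u1", "end_time": 1000, "wat_time": 45},
    {"room_id": "r1", "uid": "u2", "end_time": 1005, "wat_time": 3},
    {"room_id": "r2", "uid": "u3", "end_time": 1010, "wat_time": 10},
]

# Invalid records are discarded before any index is updated.
valid_records = [r for r in stream if is_valid(r)]
print([r["uid"] for r in valid_records])
```

The comparison is strict, so a record whose viewing duration equals the preset value is treated as invalid, matching the "greater than a preset value" condition in the text.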
S205, searching for the live broadcast room index corresponding to the user behavior data, and updating the live broadcast room index according to the user behavior data.
In this embodiment of the present invention, the user behavior data may further include the live broadcast room identifier, the user identifier and the viewing end time. The live broadcast room indexes may include the number of times the user has watched live video streams of live broadcast rooms, the user set of each live broadcast room, and the last watching time at which the user finished watching the live video stream of a live broadcast room. As shown in Fig. 2D, searching for the live broadcast room index corresponding to the user behavior data and updating it according to the user behavior data may specifically include:
S2051, searching for the watching times corresponding to the user identifier;
S2052, accumulating the watching times;
S2053, searching for the user set corresponding to the live broadcast room identifier;
S2054, if the user identifier is not recorded in the user set, writing the user identifier into the user set;
S2055, searching for the last watching time corresponding jointly to the user identifier and the live broadcast room identifier;
S2056, overwriting the last watching time with the viewing end time.
Specifically, the live broadcast room indexes may be stored as key-value pairs, so that the value is obtained by looking up its key. The user identifier contained in the user behavior data is used as the key to find the watching times corresponding to that user identifier, and the watching times are accumulated, that is, 1 is added to the current value. The result is the number of times the user has watched live video streams across all live broadcast rooms. The live broadcast room identifier is used as the key to find the user set of that live broadcast room, and if the user identifier contained in the user behavior data is not recorded in the user set, it is written into the user set. The user identifier and the live broadcast room identifier are combined as the key to find the last watching time corresponding to both, and the value found is overwritten with the viewing end time contained in the user behavior data. The result is the last watching time at which that user watched the live video stream of that live broadcast room.
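Steps S2051 to S2056 can be sketched with three in-memory key-value maps. This is a minimal illustration, assuming plain Python dictionaries in place of the live broadcast room index database; the names `view_counts`, `room_users`, `last_view_time` and `update_index` are hypothetical.

```python
from collections import defaultdict

view_counts = defaultdict(int)  # key: uid            -> views across all rooms
room_users = defaultdict(set)   # key: room_id        -> set of uids
last_view_time = {}             # key: (uid, room_id) -> last viewing end time


def update_index(record):
    """Apply one valid user behavior record to the three indexes."""
    uid, room_id = record["uid"], record["room_id"]
    # S2051-S2052: look up the watching times by user identifier and add 1.
    view_counts[uid] += 1
    # S2053-S2054: add the user to the room's user set if not yet recorded.
    room_users[room_id].add(uid)
    # S2055-S2056: overwrite the last watching time for this (user, room) pair.
    last_view_time[(uid, room_id)] = record["end_time"]


# Hypothetical sample records, for illustration only.
update_index({"room_id": "r1", "uid": "u1", "end_time": 1000, "wat_time": 45})
update_index({"room_id": "r2", "uid": "u1", "end_time": 1200, "wat_time": 60})
update_index({"room_id": "r1", "uid": "u1", "end_time": 1500, "wat_time": 30})

print(view_counts["u1"], room_users["r1"], last_view_time[("u1", "r1")])
```

Using a set for each room's users makes the "write only if not recorded" check implicit, and keying the last watching time on the (user, room) pair mirrors the combined key described in the text.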
S206, calculating the similarity between the live broadcast rooms according to the updated live broadcast room indexes.
In this embodiment of the present invention, the live broadcast room indexes include the number of times the user has watched live video streams of live broadcast rooms, the user set of each live broadcast room, and the last watching time at which the user finished watching the live video stream of a live broadcast room. Optionally, as shown in Fig. 2E, calculating the similarity between the live broadcast rooms according to the updated live broadcast room indexes may specifically include:
S2061, selecting a first live broadcast room and a second live broadcast room from the live broadcast rooms.
In an optional embodiment, the first live broadcast room and the second live broadcast room may be any two live broadcast rooms selected from all live broadcast rooms.
In another optional embodiment, when a live broadcast room is to be recommended to a user, the first live broadcast room may be one of the following: a live broadcast room created for the user by the live broadcast platform, the live broadcast room whose live video stream the user watched most recently, or the live broadcast room whose live video stream the user has watched the most times or for the longest duration. The second live broadcast room may be any live broadcast room other than the first live broadcast room.
S2062, acquiring a first user set of the first live broadcast room and a second user set of the second live broadcast room from the live broadcast room indexes.
Specifically, in the live broadcast room indexes, the first user set can be found through the live broadcast room identifier of the first live broadcast room, and the second user set can be found through the live broadcast room identifier of the second live broadcast room.
S2063, determining, based on the first user set and the second user set, a third user set of users who have watched both the live video stream of the first live broadcast room and the live video stream of the second live broadcast room.
Specifically, the third user set is obtained by taking the intersection of the first user set and the second user set. For example, if the first user set is {U1, U2, U3} and the second user set is {U1, U4}, the intersection of the two gives the third user set {U1}.
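The intersection in S2063 maps directly onto a set operation. A minimal Python sketch using the example sets from the text (the variable names are hypothetical):

```python
# User sets taken from the example above.
first_users = {"U1", "U2", "U3"}   # users of the first live broadcast room
second_users = {"U1", "U4"}        # users of the second live broadcast room

# Third user set: users who watched both live broadcast rooms.
third_users = first_users & second_users
print(third_users)
```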
S2064, obtaining a first last watching time and a second last watching time of each user in the third user set, where the first last watching time is a last watching time when the user watches a live video stream of the first live broadcast room, and the second last watching time is a last watching time when the user watches a live video stream of the second live broadcast room.
Specifically, in the live broadcast room index, the user identifier of each user together with the live broadcast room identifier of the first live broadcast room can be used as a key, and the value found for that key is the last watching time, which is taken as the first last watching time at which the user watched the live video stream of the first live broadcast room. Similarly, the user identifier of each user together with the live broadcast room identifier of the second live broadcast room is used as a key, and the value found for that key is taken as the second last watching time at which the user watched the live video stream of the second live broadcast room.
S2065, obtaining the watching times of each user in the third user set watching the live broadcast video stream of the live broadcast room.
In this embodiment, in the live broadcast room index, the user identifier of each user may be used as a key, and the value found for that key is the number of viewing times of that user.
S2066, calculating the similarity between the first live broadcast room and the second live broadcast room based on the watching times of the live broadcast video stream of the live broadcast room watched by each user in the third user set and the first last watching time and the second last watching time of each user in the third user set.
Specifically, the similarity between the first live broadcast room and the second live broadcast room can be calculated by the following formula:
Figure BDA0001836750940000111
where sim(i, j) is the similarity between the first live broadcast room i and the second live broadcast room j, U_i is the first user set of the first live broadcast room i, U_j is the second user set of the second live broadcast room j, U is the third user set of users who watched both the first live broadcast room i and the second live broadcast room j, t_ui is the first last watching time at which each user u in the third user set U watched the first live broadcast room i, t_uj is the second last watching time at which each user u in the third user set U watched the second live broadcast room j, and δ is a weighting factor;
Figure BDA0001836750940000112
q_u is the number of times each user u in the third user set U has watched live video streams in live broadcast rooms.
Calculating the similarity between the first live broadcast room and the second live broadcast room is described below with reference to an example:
when calculating the similarity, assume that the live broadcast room indexes are as follows:
the user set of the first live broadcast room is {U1, U2, U3} and the user set of the second live broadcast room is {U1, U4}, so the set of users shared by the first live broadcast room and the second live broadcast room is {U1}.
The viewing counts of users U1, U2, U3 and U4 are 10, 4, 6 and 2, respectively; the last watching time at which user U1 watched the first live broadcast room is 10 a.m. on October 10, 2018; the last watching time at which user U1 watched the second live broadcast room is 12 a.m. on October 10, 2018; and the weighting factor δ is 0.5. Then:
Figure BDA0001836750940000121
Figure BDA0001836750940000122
Figure BDA0001836750940000123
Figure BDA0001836750940000124
the similarity between the first live broadcast room and the second live broadcast room is as follows:
Figure BDA0001836750940000125
the greater the calculated similarity, the more similar the first live broadcast room and the second live broadcast room are.
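Steps S2062 to S2065 above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the `RoomIndex` layout, the function name `gather_inputs`, and the room identifiers are hypothetical, the fourth viewing count in the example is taken to belong to U4, and the second watching time is read as 12:00 on the same day. The similarity formula itself (step S2066) appears only as a figure in the source and is deliberately not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class RoomIndex:
    """Hypothetical in-memory layout of the live broadcast room index."""
    view_counts: dict = field(default_factory=dict)  # user_id -> number of views
    room_users: dict = field(default_factory=dict)   # room_id -> set of user_ids
    last_watch: dict = field(default_factory=dict)   # (user_id, room_id) -> datetime


def gather_inputs(index, room_i, room_j):
    """Collect, for each shared user, the values that step S2066 consumes."""
    users_i = index.room_users.get(room_i, set())      # S2062: first user set
    users_j = index.room_users.get(room_j, set())      # S2062: second user set
    shared = users_i & users_j                         # S2063: third user set
    rows = {}
    for user in shared:
        rows[user] = {
            "t_ui": index.last_watch[(user, room_i)],  # S2064: first last watching time
            "t_uj": index.last_watch[(user, room_j)],  # S2064: second last watching time
            "q_u": index.view_counts[user],            # S2065: viewing count
        }
    return shared, rows


# Data from the worked example (U4's count and the 12:00 reading are assumptions).
index = RoomIndex(
    view_counts={"U1": 10, "U2": 4, "U3": 6, "U4": 2},
    room_users={"room_1": {"U1", "U2", "U3"}, "room_2": {"U1", "U4"}},
    last_watch={
        ("U1", "room_1"): datetime(2018, 10, 10, 10, 0),
        ("U1", "room_2"): datetime(2018, 10, 10, 12, 0),
    },
)
shared, rows = gather_inputs(index, "room_1", "room_2")
```

Only users in the intersection need last-watching-time lookups, so the cost of gathering inputs scales with the size of the shared user set rather than with the full audience of either room.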
According to the method for calculating the similarity between live broadcast rooms of this embodiment, while a live video stream is being generated in a live broadcast room, a user behavior data stream is generated from the behavior of users watching that live video stream, the specified live broadcast room index is updated according to the user behavior data stream, and the similarity between live broadcast rooms is calculated from the updated live broadcast room index. This addresses the poor real-time performance and the inaccuracy of the similarity obtained by existing methods for calculating the similarity between live broadcast rooms: because the similarity is calculated from real-time data, it is timely and more accurate.
In an optional embodiment of the present invention, the service processing may also be performed according to the similarity between live broadcast rooms.
For different services, different forms of processing can be performed by adopting the similarity between live broadcast rooms.
In one service, live rooms can be pushed to users according to similarities between the live rooms.
For example, when it is detected that the user enters a live broadcast room, connects with the anchor of the live broadcast room, competes against the anchor of the live broadcast room, or the like, the similarity between the current live broadcast room and other live broadcast rooms can be calculated, and the several live broadcast rooms most similar to the current live broadcast room are displayed to the user.
For another example, the multiple live rooms with the highest similarity may be displayed in a recommendation bar of the live client page, or the multiple live rooms with the highest similarity may be displayed in the live room list after the user performs a refresh operation on the live room list on the live client page. Because the similarity is calculated for the live broadcast room by adopting the real-time data, the similarity has higher real-time performance and high accuracy, and the live broadcast room recommended to the user is closer to the requirement of the user.
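As a rough illustration of pushing the most similar live broadcast rooms, the sketch below ranks candidate rooms by a similarity function supplied by the caller. The function name `recommend_rooms`, the `top_n` parameter, and the scores are all hypothetical; the similarity callable stands in for whatever value the method above produces.

```python
import heapq


def recommend_rooms(current_room, candidate_rooms, similarity, top_n=3):
    """Return the top_n candidate rooms most similar to current_room."""
    others = [room for room in candidate_rooms if room != current_room]
    return heapq.nlargest(
        top_n, others, key=lambda room: similarity(current_room, room)
    )


# Made-up similarity scores, for illustration only.
scores = {("A", "B"): 0.8, ("A", "C"): 0.1, ("A", "D"): 0.5}
top = recommend_rooms("A", ["A", "B", "C", "D"], lambda i, j: scores[(i, j)], top_n=2)
```

Using `heapq.nlargest` avoids sorting the full candidate list when only a few rooms are shown in the recommendation bar.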
Example three
Fig. 3 is a schematic structural diagram of an apparatus for calculating the similarity between live broadcast rooms according to a third embodiment of the present invention. As shown in Fig. 3, the apparatus specifically includes:
a user behavior data stream generating module 301, configured to generate a user behavior data stream according to a user behavior of a user watching a live video stream in a live broadcast room in a process of generating the live broadcast video stream in the live broadcast room;
a live broadcast room index updating module 302, configured to update a specified live broadcast room index according to the user behavior data stream;
and a similarity calculation module 303, configured to calculate the similarity between live broadcast rooms according to the updated live broadcast room index.
Optionally, the user behavior data stream generating module 301 includes:
the user behavior data acquisition sub-module is used for acquiring user behavior data related to the user when the user is monitored to exit the live broadcast room and finish watching the live broadcast video stream;
and the user behavior data stream generation submodule is used for generating the user behavior data stream by adopting the user behavior data according to a preset data stream format.
Optionally, the user behavior data includes a live broadcast room identifier, a user identifier, a viewing end time, and a viewing duration, and the user behavior data acquisition sub-module includes:
an identification and viewing end time obtaining unit, configured to obtain a live broadcast room identification of the live broadcast room, a user identification of the user, and a viewing end time when the user finishes viewing the live broadcast video stream;
the watching time counting unit is used for counting the watching time of the live video stream watched by the user;
the user behavior data stream generation submodule comprises:
and the user behavior data stream generating unit is used for generating the user behavior data stream by adopting the live broadcast room identifier, the user identifier, the watching ending time and the watching duration according to a preset data stream format.
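The acquisition and generation submodules above might be sketched as follows. The source does not specify the preset data stream format, so JSON is used here purely as an assumption; the names `WatchRecord`, `to_stream_message`, and `on_user_exit`, and the use of epoch seconds, are likewise hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class WatchRecord:
    room_id: str    # live broadcast room identifier
    user_id: str    # user identifier
    end_time: int   # viewing end time, in epoch seconds (assumed unit)
    duration: int   # viewing duration, in seconds (assumed unit)


def to_stream_message(record):
    """Serialize one record in the assumed preset data stream format (JSON)."""
    return json.dumps(asdict(record), sort_keys=True)


def on_user_exit(room_id, user_id, enter_time, exit_time):
    """Build the stream message once the user is seen leaving the room."""
    record = WatchRecord(room_id, user_id, exit_time, exit_time - enter_time)
    return to_stream_message(record)


message = on_user_exit("room_1", "U1", enter_time=1539136800, exit_time=1539140400)
```

Emitting one message per exit event keeps each record self-describing, so a downstream consumer can update the index without any other state.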
Optionally, the live broadcast room index updating module 302 includes:
the user behavior data extraction submodule is used for extracting user behavior data from the user behavior data stream;
the user behavior data validity judging submodule is used for judging whether the user behavior data is valid or not;
and the live broadcast room index updating submodule is used for searching the live broadcast room index corresponding to the user behavior data and updating the live broadcast room index according to the user behavior data.
Optionally, the user behavior data includes a viewing duration, and the user behavior data validity judgment sub-module includes:
the watching time length judging unit is used for judging whether the watching time length is greater than a preset value or not;
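the user behavior data validity determining unit is used for determining that the user behavior data is valid if the watching duration is greater than the preset value;
and the user behavior data invalidity determining unit is used for determining that the user behavior data is invalid if the watching duration is not greater than the preset value.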
Optionally, the user behavior data further includes a live broadcast room identifier, a user identifier, and a viewing end time, the live broadcast room index includes a number of times that the user views a live broadcast video stream of the live broadcast room, a user set of the live broadcast room, and a last viewing time that the user finishes viewing the live broadcast video stream of the live broadcast room, and the live broadcast room index updating sub-module includes:
the viewing frequency searching unit is used for searching the viewing frequency corresponding to the user identification;
the watching frequency accumulating unit is used for accumulating the watching frequency;
the user set searching unit is used for searching a user set corresponding to the live broadcast room identifier;
a user identifier writing unit, configured to write the user identifier into the user set if the user identifier is not recorded in the user set;
the last watching time searching unit is used for searching the last watching time which corresponds to the user identification and the live broadcast room identification together;
a last viewing time overwriting unit, configured to overwrite the last viewing time with the viewing end time.
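The validity check and the index update performed by the units above can be sketched together. This is a minimal sketch under assumptions: the index is modeled as a plain dictionary, the function name `update_index` is hypothetical, and the 30-second threshold is an arbitrary stand-in for the preset value, which the source does not state.

```python
def update_index(index, record, min_duration=30):
    """Apply one user behavior record to the index; return True if applied."""
    # Validity check: the record counts only if the viewing duration
    # is greater than the preset value (30 s here is an assumed value).
    if record["duration"] <= min_duration:
        return False
    user, room = record["user_id"], record["room_id"]
    # Look up and accumulate the viewing count for this user identifier.
    counts = index.setdefault("view_counts", {})
    counts[user] = counts.get(user, 0) + 1
    # Write the user identifier into the room's user set if not yet recorded.
    index.setdefault("room_users", {}).setdefault(room, set()).add(user)
    # Overwrite the last watching time keyed by (user, room) with the end time.
    index.setdefault("last_watch", {})[(user, room)] = record["end_time"]
    return True


index = {}
applied = update_index(
    index,
    {"room_id": "room_1", "user_id": "U1", "end_time": 1539140400, "duration": 3600},
)
skipped = update_index(
    index,
    {"room_id": "room_1", "user_id": "U2", "end_time": 1539140500, "duration": 5},
)
```

Filtering short views before touching the index keeps brief, accidental visits from inflating either the viewing counts or the user sets.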
Optionally, the live broadcast room index includes a number of times that the user watches a live broadcast video stream of the live broadcast room, a user set of the live broadcast room, and a last watching time that the user finishes watching the live broadcast video stream of the live broadcast room, and the similarity calculation module 303 includes:
the live broadcast room selection submodule is used for selecting a first live broadcast room and a second live broadcast room from the live broadcast rooms;
a user set obtaining sub-module, configured to obtain, from the live broadcast room index, a first user set of the first live broadcast room and a second user set of the second live broadcast room;
a third user set determining submodule, configured to determine, based on the first user set and the second user set, a third user set that commonly watches live video streams of the first live broadcast room and live video streams of the second live broadcast room;
a last watching time obtaining sub-module, configured to obtain a first last watching time and a second last watching time of each user in the third user set, where the first last watching time is a last watching time when the user watches a live video stream of the first live broadcast room, and the second last watching time is a last watching time when the user watches a live video stream of the second live broadcast room;
the watching frequency obtaining submodule is used for obtaining the watching frequency of each user in the third user set watching the live video stream of the live broadcast room;
and the similarity calculation submodule is used for calculating the similarity between the first live broadcast room and the second live broadcast room based on the watching times of the live broadcast video stream of the live broadcast room watched by each user in the third user set and the first last watching time and the second last watching time of each user in the third user set.
Optionally, the apparatus further comprises:
and the service processing module is used for processing the service according to the similarity between the live broadcasting rooms.
The apparatus for calculating the similarity between live broadcast rooms provided by this embodiment of the invention can execute the method for calculating the similarity between live broadcast rooms provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example four
Fig. 4 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention, as shown in fig. 4, the apparatus includes a processor 400, a memory 401, an input device 402, and an output device 403; the number of processors 400 in the device may be one or more, and one processor 400 is taken as an example in fig. 4; the processor 400, the memory 401, the input device 402 and the output device 403 of the apparatus may be connected by a bus or other means, as exemplified by the bus connection in fig. 4.
The memory 401, as a computer-readable storage medium, can be used for storing software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for calculating the similarity between live broadcast rooms in the embodiments of the present invention (for example, the user behavior data stream generating module 301, the live broadcast room index updating module 302, and the similarity calculation module 303 in the apparatus for calculating the similarity between live broadcast rooms). The processor 400 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 401, that is, it implements the above-described method for calculating the similarity between live broadcast rooms.
The memory 401 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 401 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 401 may further include memory located remotely from processor 400, which may be connected to devices/terminals/servers through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 402 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the device/terminal/server. The output device 403 may include a display device such as a display screen.
Example five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method for calculating the similarity between live broadcast rooms, the method including:
in the process of generating a live video stream in a live broadcast room, generating a user behavior data stream according to the user behavior of a user watching the live video stream in the live broadcast room;
updating the specified live broadcast room index according to the user behavior data stream;
and calculating the similarity between live broadcast rooms according to the updated live broadcast room index.
Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the method for calculating the similarity between live broadcast rooms provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the apparatus for calculating the similarity between live broadcast rooms, the units and modules included are divided only according to functional logic; the division is not limited to the one described above, as long as the corresponding functions can be implemented. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for calculating the similarity between live broadcast rooms, comprising:
in the process of generating a live video stream in a live broadcast room, generating a user behavior data stream according to the user behavior of a user watching the live video stream in the live broadcast room;
updating the specified live broadcast room index according to the user behavior data stream;
and calculating the similarity between live broadcast rooms according to the updated live broadcast room index.
2. The method of claim 1, wherein generating a user behavior data stream from user behavior of a user viewing the live video stream in the live room comprises:
collecting user behavior data related to the user when the user is monitored to exit the live broadcast room and finish watching the live broadcast video stream;
and generating a user behavior data stream by adopting the user behavior data according to a preset data stream format.
3. The method of claim 2, wherein the user behavior data includes a live room identification, a user identification, a viewing end time, and a viewing duration, and wherein collecting user behavior data related to the user comprises:
acquiring a live broadcast room identifier of the live broadcast room, a user identifier of the user and watching ending time when the user finishes watching the live broadcast video stream;
counting the watching time of the user watching the live video stream;
the generating of the user behavior data stream by using the user behavior data according to the preset data stream format includes:
and generating a user behavior data stream by adopting the live broadcast room identifier, the user identifier, the watching ending time and the watching duration according to a preset data stream format.
4. The method of claim 1, 2 or 3, wherein updating the specified live broadcast room index according to the user behavior data stream comprises:
extracting user behavior data from the user behavior data stream;
judging whether the user behavior data is valid or not;
if yes, searching a live broadcast room index corresponding to the user behavior data, and updating the live broadcast room index according to the user behavior data.
5. The method of claim 4, wherein the user behavior data comprises a length of viewing time, and wherein determining whether the user behavior data is valid comprises:
judging whether the watching time length is greater than a preset value;
if yes, determining that the user behavior data is valid;
and if not, determining that the user behavior data is invalid.
6. The method of claim 4, wherein the user behavior data includes a viewing duration, a live broadcast room identifier, a user identifier, and a viewing end time, wherein the live broadcast room index includes a number of times that a user viewed a live video stream of a live broadcast room, a user set of the live broadcast room, and a last viewing time at which the user finished viewing the live video stream of the live broadcast room, and wherein searching for the live broadcast room index corresponding to the user behavior data and updating the live broadcast room index according to the user behavior data comprises:
searching the watching times corresponding to the user identification;
accumulating the watching times;
searching a user set corresponding to the live broadcast room identifier;
if the user identification is not recorded in the user set, writing the user identification into the user set;
searching the last watching time corresponding to the user identification and the live broadcast room identification;
and overwriting the last viewing time with the viewing end time.
7. The method of claim 1, 2 or 3, wherein the live broadcast room index includes a number of times that a user viewed the live video stream of the live broadcast room, a user set of the live broadcast room, and a last viewing time at which the user finished viewing the live video stream of the live broadcast room, and wherein calculating the similarity between live broadcast rooms according to the updated live broadcast room index comprises:
selecting a first live broadcast room and a second live broadcast room from the live broadcast rooms;
acquiring a first user set of the first live broadcast room and a second user set of the second live broadcast room from the live broadcast room index;
determining a third set of users who commonly watch the live video stream of the first live broadcast room and the live video stream of the second live broadcast room based on the first set of users and the second set of users;
acquiring a first last watching time and a second last watching time of each user in the third user set, wherein the first last watching time is the last watching time when the user watches the live video stream of the first live broadcast room, and the second last watching time is the last watching time when the user watches the live video stream of the second live broadcast room;
acquiring the watching times of each user in the third user set watching the live video stream of the live broadcast room;
calculating the similarity between the first live broadcast room and the second live broadcast room based on the watching times of the live broadcast video stream of the live broadcast room watched by each user in the third user set and the first last watching time and the second last watching time of each user in the third user set.
8. An apparatus for calculating the similarity between live broadcast rooms, comprising:
the user behavior data stream generation module is used for generating a user behavior data stream according to the user behavior of a user watching the live video stream in a live broadcast room in the process of generating the live video stream in the live broadcast room;
the live broadcast room index updating module is used for updating the specified live broadcast room index according to the user behavior data stream;
and the similarity calculation module is used for calculating the similarity between live broadcast rooms according to the updated live broadcast room index.
9. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for calculating the similarity between live broadcast rooms as recited in any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the method for calculating the similarity between live broadcast rooms according to any one of claims 1 to 7.
CN201811229391.9A 2018-10-22 2018-10-22 Method, device, equipment and storage medium for calculating similarity of live broadcast rooms Active CN111083503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811229391.9A CN111083503B (en) 2018-10-22 2018-10-22 Method, device, equipment and storage medium for calculating similarity of live broadcast rooms

Publications (2)

Publication Number Publication Date
CN111083503A true CN111083503A (en) 2020-04-28
CN111083503B CN111083503B (en) 2022-02-22

Family

ID=70309729

Country Status (1)

Country Link
CN (1) CN111083503B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011032339A1 (en) * 2009-09-18 2011-03-24 中兴通讯股份有限公司 Method, system and device for real-time control of ppv (pay per view) service
CN105872836A (en) * 2016-03-30 2016-08-17 武汉斗鱼网络科技有限公司 Method and device for increasing user interactivity in live broadcast website
CN106560811A (en) * 2016-09-23 2017-04-12 武汉斗鱼网络科技有限公司 Direct broadcasting room recommending method and system based on broadcaster style
CN107613395A (en) * 2017-08-28 2018-01-19 武汉斗鱼网络科技有限公司 Recommend method and system in live room
CN108307208A (en) * 2018-01-10 2018-07-20 武汉斗鱼网络科技有限公司 Calculate method, storage medium, equipment and the system of direct broadcasting room similarity
CN108419135A (en) * 2018-03-22 2018-08-17 武汉斗鱼网络科技有限公司 Similarity determines method, apparatus and electronic equipment
CN108536814A (en) * 2018-04-04 2018-09-14 武汉斗鱼网络科技有限公司 Direct broadcasting room recommends method, computer readable storage medium and electronic equipment
CN108632669A (en) * 2017-03-23 2018-10-09 北京小唱科技有限公司 A kind of network main broadcaster real-time working amount acquisition methods and system

Also Published As

Publication number Publication date
CN111083503B (en) 2022-02-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant