CN108965916B - Live video evaluation method, model establishment method, device and equipment - Google Patents

Info

Publication number
CN108965916B
Authority
CN
China
Prior art keywords
live
live video
video
sensitive
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710380359.XA
Other languages
Chinese (zh)
Other versions
CN108965916A (en)
Inventor
杨磊
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710380359.XA
Publication of CN108965916A
Application granted
Publication of CN108965916B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266: Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/478: Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788: Supplemental services communicating with other users, e.g. chatting

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a method for evaluating live video, comprising the following steps: acquiring barrage data of a target live video; determining, from the barrage data, features representing each sensitive dimension; inputting the features of each sensitive dimension into a live evaluation model and determining the sensitivity index of the target live video, where the sensitivity index reflects the proportion of sensitive content contained in the target live video; and when the sensitivity index is greater than a sensitivity threshold, determining that the target live video is a sensitive video. The live video evaluation method provided by the embodiment of the application can effectively analyze whether webcast video content is sensitive, so that webcasts can be managed effectively.

Description

Live video evaluation method, model establishment method, device and equipment
Technical Field
The invention relates to the technical field of internet, in particular to a live video evaluation method, a live evaluation model establishment method, a live evaluation device and live evaluation equipment.
Background
With the rise of webcasting and online video playback, live pornographic content has also increased sharply, because it attracts more users and earns more traffic.
How to effectively identify whether the content a user plays on demand or streams live contains pornographic material, and how to manage live broadcasts that contain it, is a problem faced by every large internet platform provider.
The current approach captures screenshots of the video content and checks whether each captured picture contains pornographic material. This approach struggles to identify a long video that contains only a short pornographic segment, and its complexity makes it unsuitable for monitoring live content, so live video content cannot be managed effectively.
Disclosure of Invention
In order to solve the problem that the video content of live webcast cannot be effectively managed in the prior art, embodiments of the present invention provide a live webcast video evaluation method, which can effectively analyze whether the video content of live webcast is sensitive, so as to effectively manage live webcast. The embodiment of the application also provides a method for establishing the live broadcast evaluation model, and a device and equipment corresponding to the method for evaluating the live broadcast video and the method for establishing the live broadcast evaluation model.
A first aspect of the present application provides a method for evaluating a live video, including:
acquiring bullet screen data of a target live broadcast video;
determining characteristics representing each sensitive dimension from the bullet screen data;
inputting the characteristics of the sensitive dimensions into a live broadcast evaluation model, and determining the sensitivity index of the target live broadcast video, wherein the sensitivity index reflects the proportion of sensitive contents contained in the target live broadcast video;
and when the sensitivity index is larger than a sensitivity threshold, determining that the target live video is a sensitive video.
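The four steps above amount to scoring a feature vector with a logistic model and thresholding the result. A minimal sketch of this evaluation pipeline follows; the feature names, weight values, and threshold here are illustrative assumptions, not values from the patent:

```python
import math

def sensitivity_index(features, theta):
    """Logistic model h_theta(x) = g(theta^T x) = 1 / (1 + exp(-theta^T x))."""
    z = sum(t * x for t, x in zip(theta, features))
    return 1.0 / (1.0 + math.exp(-z))

def evaluate_live_video(features, theta, sensitivity_threshold=0.5):
    """Compute the sensitivity index and compare it to the threshold."""
    index = sensitivity_index(features, theta)
    return index, index > sensitivity_threshold

# Hypothetical feature vector x: [sensitive-word frequency, normalized barrage
# count, viewer growth rate, share of viewers with a sensitive-video history].
# theta would come from the trained live evaluation model.
index, flagged = evaluate_live_video([0.8, 0.6, 0.9, 0.7], [1.2, 0.5, 1.0, 1.5])
print(f"sensitivity index = {index:.3f}, sensitive = {flagged}")
```

In a deployment, a `flagged` result would trigger the alarm prompt to the live broadcast management background described later in the document.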
A second aspect of the present application provides a method for establishing a live broadcast evaluation model, including:
acquiring barrage data of a first type of live broadcast and barrage data of a second type of live broadcast which are selected as samples, wherein the first type of live broadcast is sensitive live broadcast, and the second type of live broadcast is non-sensitive live broadcast;
and training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model.
A third aspect of the present application provides an apparatus for live video evaluation, including:
the acquisition program module is used for acquiring barrage data of the target live video;
the first determining program module is used for determining the characteristics representing each sensitive dimension from the bullet screen data acquired by the acquiring program module;
the second determining program module is used for inputting the characteristics of the sensitive dimensions determined by the first determining program module into a live broadcast evaluation model, and determining the sensitivity index of the target live broadcast video, wherein the sensitivity index reflects the proportion of sensitive content contained in the target live broadcast video;
and the third determining program module is used for determining that the target live video is a sensitive video when the sensitivity index determined by the second determining program module is greater than a sensitivity threshold.
The fourth aspect of the present application provides an apparatus for establishing a live broadcast assessment model, including:
the system comprises an acquisition program module, a storage module and a display module, wherein the acquisition program module is used for acquiring barrage data of a first type of live broadcast and barrage data of a second type of live broadcast which are selected as samples, the first type of live broadcast is sensitive live broadcast, and the second type of live broadcast is non-sensitive live broadcast;
and the model training program module is used for training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video acquired by the acquisition program module so as to obtain the live evaluation model.
A fifth aspect of the present application provides a computer device comprising: an input/output (I/O) interface, a processor, and a memory having stored therein instructions for live video evaluation of the first aspect;
the processor is configured to execute instructions for live video assessment stored in the memory, to perform the steps of the method for live video assessment as described in the first aspect.
A sixth aspect of the present application provides a computer device comprising: an input/output (I/O) interface, a processor, and a memory, the memory having stored therein instructions for the live evaluation model establishment of the second aspect;
the processor is configured to execute the instructions of the live evaluation model establishment stored in the memory, and to perform the steps of the method of the live evaluation model establishment according to the second aspect.
Yet another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
Yet another aspect of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of the above-described aspects.
Compared with the prior art, in which live video content cannot be managed well, the live video evaluation method provided by the embodiment of the application can effectively analyze whether live video content is sensitive, so that live broadcasts can be managed effectively.
Drawings
FIG. 1 is a schematic diagram of a network architecture of a distributed system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a distributed system in a simulation scenario in the embodiment of the present application;
FIG. 3 is a schematic diagram of an embodiment of a method for establishing a live broadcast evaluation model in an embodiment of the present application;
fig. 4 is a schematic diagram of another embodiment of a method for establishing a live broadcast evaluation model provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an embodiment of a method for evaluating live video in an embodiment of the present application;
FIG. 6 is a schematic diagram of an example of a scene of a method for evaluating live video in an embodiment of the present application;
fig. 7 is a schematic diagram of another embodiment of a method for evaluating live video in an embodiment of the present application;
FIG. 8 is a schematic diagram of an embodiment of an apparatus for evaluating live video in an embodiment of the present application;
fig. 9 is a schematic diagram of another embodiment of an apparatus for evaluating live video in an embodiment of the present application;
fig. 10 is a schematic diagram of an embodiment of an apparatus for establishing a live broadcast evaluation model in an embodiment of the present application;
FIG. 11 is a schematic diagram of an embodiment of a computer device in an embodiment of the present application.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. As can be appreciated by those skilled in the art, as technology advances, the technical solutions provided by the embodiments of the present invention are also applicable to similar technical problems.
The embodiment of the invention provides a live video evaluation method which can effectively analyze whether the video content of live webcast is sensitive or not, thereby effectively managing the live webcast. The embodiment of the application also provides a corresponding device. The following are detailed below.
Webcasting refers to live broadcasting over an internet platform. Many webcast platforms exist today; after installing the corresponding platform software, a user can select a live room on the platform and enter it to watch the live content.
Because the network platforms place few requirements on broadcasters, the content they stream is of all kinds, including eating broadcasts, makeup broadcasts, game broadcasts, and broadcasts of sensitive content.
To manage sensitive content, the platform provider must capture screenshots of the video in every live room on the platform and then analyze whether each screenshot contains sensitive content. This approach involves a huge processing load, and the screenshot analysis may not be accurate enough, so the platform provider still cannot manage webcast video content effectively.
Broadcasters interact with their audience through the barrage. A broadcaster streaming unhealthy content necessarily wants to attract viewers, so unhealthy barrage messages appear during the interaction and run throughout the broadcast. The embodiment of the application therefore performs text analysis on the barrage and applies different degrees of monitoring according to the analysis results: if a large number of unhealthy words appear and the traffic of the live room surges, that type of room is monitored closely, so that live video content can be managed effectively.
To manage videos according to the barrage in live video, a live evaluation model must first be established.
The embodiment of the application relates to establishment and use of a live broadcast evaluation model, wherein the live broadcast evaluation model is used for calculating a sensitivity index of a live broadcast video, and the sensitivity index can reflect the proportion of sensitive contents contained in the target live broadcast video.
Whether the live evaluation model is built or used, the evaluation model is implemented based on background computing devices, which are usually in the form of clusters, but may be independent computing devices. The following description will be made by taking the form of a cluster as an example.
As shown in fig. 1, fig. 1 is a schematic diagram of a network architecture of a distributed system according to an embodiment of the present application;
the distributed system provided by the embodiment of the application comprises a Master control node (Master)10, a network 20 and a plurality of working nodes (Worker)30, wherein the Master control node 10 and the working nodes 30 can communicate through the network 20, the working nodes in the distributed system are responsible for storing various live videos, the Master control node can be a device for evaluating the live videos or a device for establishing a live evaluation model in the embodiment of the application, and is responsible for establishing the live evaluation model or determining the sensitivity index of the live video according to the live videos stored in the working nodes, and the sensitivity index reflects the proportion of sensitive contents contained in the target live video. When the video is determined to be sensitive video, alarm prompt information can be sent to background management personnel. In the embodiment of the present application, there may be one or more master nodes 10. For example, in order to ensure the reliability of the system, a standby main control node may be deployed to share a part of the load when the load of the currently operating main control node is too high, or to take over the work of the main control node when the currently operating main control node fails. Both the master node 10 and the worker node 30 may be physical hosts.
The distributed system may also be a virtualized system, which may take the form shown in fig. 2. In the virtualization scenario, the distributed system includes a hardware layer, a Virtual Machine Monitor (VMM) 1001 running above the hardware layer, and a plurality of virtual machines 1002. One or more virtual machines may be selected as master control nodes and a plurality of virtual machines as working nodes.
Specifically, virtual machine 1002: one or more virtual computers are simulated on common hardware resources through virtual machine software, the virtual machines work like real computers, an operating system and an application program can be installed on the virtual machines, and the virtual machines can also access network resources. For applications running in a virtual machine, the virtual machine operates as if it were a real computer.
Hardware layer: the hardware platform on which the virtualized environment operates may be abstracted from the hardware resources of one or more physical hosts. The hardware layer may include various hardware, including, for example, a processor 1004 (e.g., CPU) and a memory 1005, and may also include a network card 1003 (e.g., RDMA network card), high-speed/low-speed Input/Output (I/O) devices, and other devices with specific processing functions.
In addition, the distributed system under the virtualization scenario may further include a Host: as the management layer, it completes the management and allocation of hardware resources, presents a virtual hardware platform to the virtual machines, and implements virtual machine scheduling and isolation. The Host may be a Virtual Machine Monitor (VMM); alternatively, the VMM cooperates with a privileged virtual machine, and their combination constitutes the Host. The virtual hardware platform provides various hardware resources to each virtual machine running on it, such as a virtual processor (e.g., VCPU), virtual memory, a virtual disk, a virtual network card, and so on. The virtual disk may correspond to a file or a logical block device of the Host. A virtual machine runs on the virtual hardware platform that the Host prepares for it, and one or more virtual machines run on each Host.
Privileged virtual machines: a special virtual machine, also called a driver domain, for example, is called Dom0 on the Xen Hypervisor platform, and a driver of a real physical device, such as a network card or a SCSI disk, is installed in the virtual machine, and can detect and directly access the real physical device. Other virtual machines access the real physical device through the privileged virtual machine using the corresponding mechanisms provided by Hypervisor.
The above is a description of the network architecture of the present application, and the method for establishing the live broadcast evaluation model in the embodiment of the present application is described first with reference to the network architecture.
As shown in fig. 3, an embodiment of a method for establishing a live broadcast evaluation model provided in an embodiment of the present application includes:
101. the method comprises the steps of obtaining first type live barrage data and second type live barrage data which are selected as samples, wherein the first type live broadcast is sensitive live broadcast, and the second type live broadcast is non-sensitive live broadcast.
When the live evaluation model is established, live videos containing sensitive content, for example live videos containing pornographic content, need to be screened out of historical live videos in advance, and normal live videos containing no sensitive content need to be screened out as well.
102. And training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model.
Compared with the prior art that the live broadcast video content cannot be effectively analyzed, the live broadcast evaluation model establishment method provided by the embodiment of the application can establish the live broadcast evaluation model through the barrage data, so that live broadcast video in a live broadcast room can be effectively analyzed, and a platform provider can timely manage live broadcast video containing sensitive content.
Wherein, according to the barrage data of the first type of live broadcast video and the barrage data of the second type of live broadcast video, training an initial evaluation model to obtain the live broadcast evaluation model may include:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is:
Figure DEST_PATH_GDA0001370607870000061
wherein the content of the first and second substances,
hθ(x) And g (theta)Tx) is the sensitivity index;
x is a matrix and comprises the characteristics which represent each sensitive dimension in each live video;
theta is a weight matrix; the number of theta corresponds to the number of x, and theta isTA transposed matrix representing θ;
and training through the characteristics representing the sensitive dimensionality in each live video to obtain the value of the theta so as to obtain the live evaluation model.
Wherein, the determining of the characteristics of each sensitive dimension represented in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video comprises:
respectively extracting training barrage texts and training barrage flow data in each live video from the first type of live barrage data and the second type of live barrage data;
determining the word-frequency feature of sensitive words contained in the training barrage text, and determining, from the training barrage traffic data, at least one of: the number of barrage messages, the clicked increment of each live video, and the proportion of viewers of each live video who have clicked sensitive videos.
In this embodiment of the application, the barrage data consists of two parts: barrage text and barrage traffic data. The training barrage text and training barrage traffic data are the text and traffic data used to train the live evaluation model.
The barrage text is the text typed by viewers in the live room, and the barrage traffic data includes at least one of the number of barrage messages, the clicked increment of each live video, and the proportion of viewers of each live video who have clicked sensitive videos.
The clicked increment of a live video is the growth in that video's audience. The proportion of viewers of a live video who have clicked sensitive videos is the ratio of the number of its viewers with a history of watching sensitive videos to its total number of viewers.
As for the value of the feature of each sensitive dimension: if the corresponding condition indicates sensitive content, the dimension may take the value 1; otherwise it may take the value 0.
The value of each dimension is the value of the corresponding component of x. Because sample data is used during model training, the value of h_θ(x) is known, and the value of θ can be learned from many samples, thereby obtaining the live evaluation model.
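The training step can be sketched as plain logistic regression fitted by gradient steps. The patent specifies the LR model form but not the optimizer, so the update rule, learning rate, and toy data below are assumptions:

```python
import math

def train_live_evaluation_model(samples, labels, lr=0.5, epochs=2000):
    """Fit theta so that g(theta^T x) approaches each label (1 = sensitive)."""
    theta = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(t * xi for t, xi in zip(theta, x))
            h = 1.0 / (1.0 + math.exp(-z))          # h_theta(x)
            theta = [t + lr * (y - h) * xi           # gradient step on log-loss
                     for t, xi in zip(theta, x)]
    return theta

def sensitivity(theta, x):
    return 1.0 / (1.0 + math.exp(-sum(t * xi for t, xi in zip(theta, x))))

# Toy samples: a leading 1.0 acts as a bias term; the first two rows mimic
# sensitive live rooms, the last two normal ones.
samples = [[1.0, 0.9, 0.8], [1.0, 0.8, 0.9], [1.0, 0.1, 0.2], [1.0, 0.2, 0.1]]
labels = [1, 1, 0, 0]
theta = train_live_evaluation_model(samples, labels)
```

After training, feature vectors resembling the sensitive samples score above 0.5 and those resembling the normal samples score below it, which is exactly the separation the live evaluation model needs.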
Taking the erotic live broadcast as an example, the following describes a training process of a live broadcast evaluation model in the embodiment of the present application with reference to fig. 4.
Model trainers label the barrage data of each video according to the full barrage data and pre-stored historical data: if a video's barrage data involves pornographic content, it is labeled 1; if it does not, i.e. it is normal video barrage data, it is labeled 0.
A: Separate the pornographic barrage data from the full barrage data.
The pornographic barrage data can be separated according to the label of each video's barrage data, i.e. label 1.
B: Separate the normal barrage data from the full barrage data.
The normal barrage data can be separated according to the label of each video's barrage data, i.e. label 0.
C: and respectively extracting bullet screen texts aiming at the bullet screen data of each video from the pornographic bullet screen data and the normal bullet screen data.
After obtaining the bullet screen text, converting the bullet screen text into data which can be identified by a computer, wherein the most common dictionary word segmentation method in text identification is adopted, and the commonly used prepositions ' are removed, namely, the Chinese language and the character assistant ' are obtained, and hiccup ' are removed; the corresponding dictionary then finds the number of times a word appears within the text.
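A toy version of this preprocessing, assuming a hand-written dictionary and stop-word list; forward maximum matching stands in for whichever dictionary segmenter is actually used:

```python
from collections import Counter

STOPWORDS = {"的", "了", "呢", "啊"}           # hypothetical function words to drop
DICTIONARY = {"主播", "好看", "跳舞", "福利"}    # hypothetical segmentation dictionary

def segment(text, max_len=2):
    """Greedy forward maximum matching against DICTIONARY."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in DICTIONARY:
                words.append(piece)
                i += size
                break
    return words

def word_counts(barrage_lines):
    """Count word occurrences across barrage lines, skipping stop-words."""
    counts = Counter()
    for line in barrage_lines:
        counts.update(w for w in segment(line) if w not in STOPWORDS)
    return counts

counts = word_counts(["主播跳舞好看", "好看的主播"])
```

A real system would use a full segmentation library rather than this toy dictionary; the point is only the pipeline of segment, filter, and count.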
D: and respectively extracting the sensitive features of each sensitive dimension aiming at the bullet screen data of each video from the erotic bullet screen data and the normal bullet screen data.
In this embodiment, the sensitive dimension is a pornographic dimension, and the sensitive feature is a pornographic feature.
The example sensitive features may include the number of barrages, the speed of increase per second of the number of live viewers, the percentage of users watching pornographic live broadcasts among live users, etc.
E: and calculating the word frequency of the sensitive words from the bullet screen text.
Figure DEST_PATH_GDA0001370607870000081
The numerator is the number of occurrences of the sensitive word, and the denominator is the total number of occurrences of all words.
In this embodiment, the numerator is the number of times the pornograph appears, and the denominator is the total number of all words in the bullet screen of the video.
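In code, the word-frequency feature of step E is just that ratio; the word counts and sensitive-word list below are hypothetical:

```python
def sensitive_word_frequency(word_counts, sensitive_words):
    """TF = occurrences of sensitive words / total occurrences of all words."""
    total = sum(word_counts.values())
    if total == 0:
        return 0.0
    hits = sum(count for word, count in word_counts.items()
               if word in sensitive_words)
    return hits / total

# Hypothetical counts over one video's barrage text.
counts = {"dance": 4, "nice": 3, "naked": 2, "strip": 1}
tf = sensitive_word_frequency(counts, sensitive_words={"naked", "strip"})
print(tf)  # 3 sensitive occurrences out of 10 words -> 0.3
```

The resulting value becomes one component of the feature vector x fed to the evaluation model.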
F: modeling using a classification algorithm.
In this step the initial evaluation model

h_θ(x) = g(θ^T x) = 1 / (1 + e^(-θ^T x))

is trained with a Logistic Regression (LR) algorithm, using the pornographic word frequency and the pornographic-dimension features of each video in the sample set, to obtain the weight vector θ.
Because sample data is used throughout model training, the value of h_θ(x) is known for each sample, and the value of θ can be learned from many samples, thereby obtaining the live evaluation model.
G: and (5) a live broadcast evaluation model.
The above is a description of a process of establishing a live evaluation model, and a process of evaluating a target live video using a live evaluation model in an embodiment of the present application is described below with reference to fig. 5.
Referring to fig. 5, an embodiment of a method for evaluating a live video provided in an embodiment of the present application includes:
201. and acquiring bullet screen data of the target live broadcast video.
The target live video may be a video being streamed in a live room on the network platform. The barrage data comprises all data related to the barrage, including the barrage text, the users who sent the barrage, the number of barrage messages, the growth in barrage-sending users, and so on.
202. And determining the characteristics representing the sensitive dimensions from the bullet screen data.
This step may include:
extracting barrage text and barrage traffic data from the barrage data;
determining the word-frequency feature of sensitive words contained in the barrage text, and determining, from the barrage traffic data, at least one of: the number of barrage messages, the clicked increment of the target live video, and the proportion of viewers of the target live video who have clicked sensitive videos.
The barrage text is the text typed by viewers in the live room, and the barrage traffic data includes at least one of the number of barrage messages, the clicked increment of each live video, and the proportion of viewers of each live video who have clicked sensitive videos.
The clicked increment of the target live video is the growth in that video's audience. The proportion of viewers of the target live video who have clicked sensitive videos is the ratio of the number of its viewers with a history of watching sensitive videos to its total number of viewers.
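Both traffic-derived features can be computed directly from viewer logs. A sketch assuming we have viewer ids and a set of ids with a sensitive-video history (all names are illustrative):

```python
def viewer_growth(count_before, count_after):
    """Clicked increment: how many viewers the target live video gained."""
    return count_after - count_before

def sensitive_history_ratio(viewers, watched_sensitive):
    """Share of current viewers who have previously watched sensitive videos.

    viewers:           ids of everyone watching the target live video
    watched_sensitive: ids known to have a sensitive-video viewing history
    """
    if not viewers:
        return 0.0
    return len(set(viewers) & set(watched_sensitive)) / len(viewers)

growth = viewer_growth(120, 300)
ratio = sensitive_history_ratio(["u1", "u2", "u3", "u4"], ["u2", "u4", "u9"])
```

These two numbers, together with the word-frequency feature, form the feature vector x that is fed to the live evaluation model in the next step.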
203. Input the features of each sensitive dimension into a live evaluation model and determine the sensitivity index of the target live video, where the sensitivity index reflects the proportion of sensitive content contained in the target live video.
This step may include:
determining the sensitivity index of the target live video according to the live evaluation model:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein:
h_θ(x), i.e. g(θᵀx), is the sensitivity index;
x is a matrix containing the word frequency feature and at least one of the growth amount of clicks on the target live video and the proportion of objects clicking the target live video that have clicked sensitive videos;
θ is a weight matrix whose weight values are obtained during training of the live evaluation model; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transpose of θ.
In this calculation, since the values of x and θ are known, the value of h_θ(x) can be computed, for example: h_θ(x) = 0.7.
204. When the sensitivity index is greater than a sensitivity threshold, determine that the target live video is a sensitive video.
If the sensitivity threshold is 0.5, then when h_θ(x) is greater than 0.5 the target live video is determined to be a sensitive video, and when h_θ(x) is less than 0.5 the target live video is a non-sensitive video. An alarm prompt message may then be sent to the live broadcast management background.
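As a concrete illustration of steps 203 and 204, the sketch below computes h_θ(x) and applies the 0.5 threshold. The weight and feature values are invented for illustration; real values would come from model training.

```python
import math

def sensitivity_index(theta, x):
    """h_theta(x) = g(theta^T x) = 1 / (1 + e^(-theta^T x))."""
    z = sum(w * f for w, f in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

theta = [0.8, 0.05, 1.2]   # hypothetical trained weights
x = [0.3, 2.0, 0.25]       # word frequency, audience growth, sensitive-viewer ratio

h = sensitivity_index(theta, x)
is_sensitive = h > 0.5     # the sensitivity threshold used in the text
```

With these illustrative values, h_θ(x) is about 0.65, so the video would be flagged as sensitive.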
As shown in fig. 6, an "anchor" wearing a skirt in a live room of a network live platform is performing a hot dance. During the dance her thighs are exposed and the skirt flies up, the number of audiences of the live room keeps increasing, and many audiences leave messages in the bullet screen, such as: audience one: bullet screen content 1; audience two: bullet screen content 2; audience three: bullet screen content 3; and so on. Many audiences also praise the anchor's hot dance.
During the live broadcast, the background device can acquire the bullet screen data of the live room, including the bullet screen text (audience one: bullet screen content 1; audience two: bullet screen content 2; audience three: bullet screen content 3; and so on), as well as the number of audiences, the growth of audiences, the proportion of audiences who have watched pornographic videos, and the like.
The background device inputs the sensitive dimension features into the live evaluation model and calculates the sensitivity index h_θ(x). If h_θ(x) is 0.7, it determines that the video broadcast in the live room is a pornographic video and sends an alarm prompt message to the live room, for example: "Pay attention to the live content, otherwise the room will be closed."
Compared with the prior art, in which live video content cannot be managed well, the live video evaluation method provided in the embodiment of the present application can effectively analyze whether live video content is sensitive, so that live video can be managed effectively.
In conjunction with the establishment process of the live broadcast evaluation model shown in fig. 4, the method for evaluating live broadcast video provided in the embodiment of the present application can be understood with reference to the process shown in fig. 7. In practice, the model training process may be an off-line training process or an on-line training process.
A: Divide pornographic bullet screen data from the full bullet screen data.
The pornographic bullet screen data may be divided according to the label of each video's bullet screen data, for example label 0.
B: Divide normal bullet screen data from the full bullet screen data.
The normal bullet screen data may be divided according to the label of each video's bullet screen data, for example label 1.
C: Extract the bullet screen text of each video's bullet screen data from the pornographic bullet screen data and the normal bullet screen data, respectively.
After the bullet screen text is obtained, it is converted into data that a computer can recognize. The most common dictionary-based word segmentation method in text recognition is adopted, commonly used function words (prepositions, auxiliary particles, and interjections) are removed, and the number of times each word appears in the text is then counted against the corresponding dictionary.
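The dictionary segmentation and function-word removal can be sketched as follows. The tiny dictionary, the stopword list, and the greedy longest-match segmenter are illustrative stand-ins; the patent does not specify a particular implementation.

```python
from collections import Counter

DICTIONARY = {"hot", "dance", "skirt", "the", "so"}  # toy word dictionary
STOPWORDS = {"the", "so"}   # stand-ins for the removed function words

def segment(text, dictionary, max_len=5):
    """Greedy longest-match segmentation against a word dictionary."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                words.append(text[i:j])
                i = j
                break
        else:
            i += 1            # skip characters not covered by the dictionary
    return words

counts = Counter(w for w in segment("thehotdancesohot", DICTIONARY)
                 if w not in STOPWORDS)
# counts holds per-word occurrence numbers with function words removed
```

The resulting per-word counts are the computer-recognizable form of the bullet screen text that the later word-frequency step consumes.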
D: and respectively extracting the sensitive features of each sensitive dimension aiming at the bullet screen data of each video from the erotic bullet screen data and the normal bullet screen data.
In this embodiment, the sensitive dimension is a pornographic dimension, and the sensitive feature is a pornographic feature.
Example sensitive features may include the number of bullet screens, the per-second growth rate of the number of live audiences, the proportion of live users who have watched pornographic live broadcasts, and the like.
E: and calculating the word frequency of the sensitive words from the bullet screen text.
Figure DEST_PATH_GDA0001370607870000111
The numerator is the number of occurrences of the sensitive word, and the denominator is the total number of occurrences of all words.
In this embodiment, the numerator is the number of times the pornograph appears, and the denominator is the total number of all words in the bullet screen of the video.
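Given the per-word counts from step C, the word-frequency formula above reduces to a simple ratio. The word counts and sensitive-word set below are invented for illustration.

```python
def sensitive_word_frequency(word_counts, sensitive_words):
    """Word frequency = occurrences of sensitive words / total occurrences of all words."""
    total = sum(word_counts.values())
    if total == 0:
        return 0.0
    hits = sum(c for w, c in word_counts.items() if w in sensitive_words)
    return hits / total

# Hypothetical word counts from one video's bullet screens.
freq = sensitive_word_frequency({"hot": 2, "dance": 1, "nice": 1}, {"hot"})
# 2 occurrences of the sensitive word out of 4 words in total
```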
F: modeling using a classification algorithm.
This process uses the Logistic Regression (LR) algorithm to build an initial evaluation model from the pornographic word frequency and the pornographic dimension features of each video in the sample videos:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

and trains it to obtain the weight matrix θ.
Since all the sample data are labeled, the value of h_θ(x) is known during training, and the value of θ can be trained through a number of samples, thereby obtaining the live evaluation model.
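The LR training step can be sketched as batch gradient descent on the logistic loss. This is an illustrative stand-in, not the patent's implementation; note also that the label convention here (1 = sensitive, so that a higher sensitivity index means more sensitive) is an assumption, whereas the text above assigns label 0 to pornographic data and label 1 to normal data. Production code would use an optimized library solver.

```python
import math

def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit the weights theta of h_theta(x) = 1/(1 + e^(-theta^T x))
    by batch gradient descent on the logistic loss."""
    n = len(samples[0])
    m = len(samples)
    theta = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in zip(samples, labels):
            z = sum(w * f for w, f in zip(theta, x))
            h = 1.0 / (1.0 + math.exp(-z))
            for j in range(n):
                grad[j] += (h - y) * x[j]   # gradient of the logistic loss
        theta = [w - lr * g / m for w, g in zip(theta, grad)]
    return theta

# Toy samples: [bias term, sensitive-word frequency]; label 1 = sensitive here.
X = [[1.0, 0.9], [1.0, 0.8], [1.0, 0.1], [1.0, 0.2]]
y = [1, 1, 0, 0]
theta = train_logistic(X, y)
```

After training, applying h_θ(x) to a high-frequency sample yields an index above 0.5 and to a low-frequency sample an index below 0.5, which is the separation the live evaluation model relies on.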
G: and (5) a live broadcast evaluation model.
H: and acquiring live broadcast room bullet screen data.
And extracting the bullet screen text and the sensitive characteristics of the bullet screen data of the live broadcast room, and performing word frequency analysis on the bullet screen text.
And inputting the word frequency and the sensitive characteristics into a live broadcast evaluation model for prediction.
I: the predicted outcome includes normal and malicious.
If the live broadcast is malicious, referring to fig. 6, an alarm prompt may be sent to a background manager, the background manager pays attention to the live broadcast room, and then the alarm prompt is sent to the live broadcast room, or the alarm prompt may be sent directly to the live broadcast room.
Having described above the process of establishing a live evaluation model and performing live evaluation using the live evaluation model, a live video evaluation apparatus and a live evaluation model establishment apparatus in the embodiments of the present application are described below with reference to the accompanying drawings.
Referring to fig. 8, an embodiment of the apparatus 40 for evaluating live video according to the present application includes:
an obtaining program module 401, configured to obtain barrage data of a target live video;
a first determining program module 402, configured to determine, from the bullet screen data acquired by the acquiring program module 401, features representing various sensitive dimensions;
a second determining program module 403, configured to input the features of each sensitive dimension determined by the first determining program module 402 into a live evaluation model and determine a sensitivity index of the target live video, where the sensitivity index reflects the proportion of sensitive content contained in the target live video;
a third determining program module 404, configured to determine that the target live video is a sensitive video when the sensitivity index determined by the second determining program module 403 is greater than a sensitivity threshold.
Compared with the prior art, in which live video content cannot be managed well, the live video evaluation device provided in the embodiment of the present application can effectively analyze whether live video content is sensitive, so that live video can be managed effectively.
Alternatively, in another embodiment of the apparatus 40 for live video evaluation provided by the embodiment of the present application,
the first determining program module 402 is for:
extracting a bullet screen text and bullet screen flow data from the bullet screen data;
determining the word frequency characteristics of sensitive words contained in the bullet screen text, and determining at least one of the number of bullet screens, the clicked increment of the target live video and the proportion of the clicked sensitive video in the object clicking the target live video from the bullet screen flow data.
Alternatively, in another embodiment of the apparatus 40 for live video evaluation provided by the embodiment of the present application,
the second determining program module 403 is configured to:
determine the sensitivity index of the target live video according to the live evaluation model:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the word frequency feature and at least one of the clicked growth amount of the target live video and the proportion of clicked sensitive videos among the objects clicking the target live video; θ is a weight matrix whose weight values are obtained during training of the live evaluation model; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ.
Optionally, referring to fig. 9, in another embodiment of the apparatus for live video assessment provided in this application, the apparatus 40 further includes a model training program module 405,
the obtaining program module 401 is further configured to obtain barrage data of a first type of live video and barrage data of a second type of live video that are selected as samples, where the first type of live video is sensitive live video and the second type of live video is non-sensitive live video;
and a model training program module 405, configured to train an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video acquired by the acquisition program module 401, so as to obtain the live evaluation model.
Alternatively, in another embodiment of the apparatus 40 for live video evaluation provided by the embodiment of the present application,
the model training program module 405 is used to:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the features characterizing each sensitive dimension in each live video; θ is a weight matrix; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ;
and the value of θ is obtained through training with the features characterizing each sensitive dimension in each live video, so as to obtain the live evaluation model.
Alternatively, in another embodiment of the apparatus 40 for live video evaluation provided by the embodiment of the present application,
the model training program module 405 is used to:
respectively extracting training barrage texts and training barrage flow data in each live video from the first type of live barrage data and the second type of live barrage data;
determining the word frequency characteristics of sensitive words contained in the training barrage text, and determining at least one of the number of barrages used for training a model, the clicked increment of each live video and the clicked proportion of the sensitive video in the object of each live video from the training barrage flow data.
The above is a description of an apparatus for evaluating a live video, which can be understood by referring to fig. 1 to fig. 7, and will not be repeated herein.
Referring to fig. 10, an embodiment of the apparatus 50 for establishing a live broadcast evaluation model provided in the embodiment of the present application includes:
an obtaining program module 501, configured to acquire barrage data of a first type of live video and barrage data of a second type of live video that are selected as samples, where the first type of live video is sensitive live broadcast and the second type of live video is non-sensitive live broadcast;
a model training program module 502, configured to train an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video acquired by the acquisition program module 501, so as to obtain the live evaluation model.
Compared with the prior art in which live video content cannot be effectively analyzed, the device for establishing the live evaluation model provided by the embodiment of the application can establish the live evaluation model through barrage data, so that live video in a live broadcast room can be effectively analyzed, and timely management of a platform provider on live video containing sensitive content is facilitated.
Alternatively, in another embodiment of the apparatus 50 for establishing a live broadcast evaluation model provided in the embodiment of the present application,
the model training program module 502 is configured to:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the features characterizing each sensitive dimension in each live video; θ is a weight matrix; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ;
and the value of θ is obtained through training with the features characterizing each sensitive dimension in each live video, so as to obtain the live evaluation model.
Alternatively, in another embodiment of the apparatus 50 for establishing a live broadcast evaluation model provided in the embodiment of the present application,
the model training program module 502 is configured to:
respectively extracting training barrage texts and training barrage flow data in each live video from the first type of live barrage data and the second type of live barrage data;
determining the word frequency characteristics of sensitive words contained in the training barrage text, and determining at least one of the number of barrages used for training a model, the clicked increment of each live video and the clicked proportion of the sensitive video in the object of each live video from the training barrage flow data.
The device 50 for establishing a live broadcast evaluation model provided in the embodiment of the present application can be understood with reference to the corresponding descriptions in fig. 1 to fig. 7, and repeated descriptions are not repeated here.
Fig. 11 is a schematic structural diagram of a computer device 60 according to an embodiment of the present invention. The computer device 60 includes a processor 610, a memory 650, and a transceiver 630, and the memory 650 may include a read-only memory and a random access memory, and provides operating instructions and data to the processor 610. A portion of the memory 650 may also include non-volatile random access memory (NVRAM).
In some embodiments, memory 650 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof:
In an embodiment of the present invention, by calling the operation instructions stored in the memory 650 (which may be stored in an operating system), the processor 610 performs the following operations:
acquiring bullet screen data of a target live broadcast video;
determining characteristics representing each sensitive dimension from the bullet screen data;
inputting the characteristics of the sensitive dimensions into a live broadcast evaluation model, and determining the sensitivity index of the target live broadcast video, wherein the sensitivity index reflects the proportion of sensitive contents contained in the target live broadcast video;
and when the sensitivity index is larger than a sensitivity threshold, determining that the target live video is a sensitive video.
Compared with the prior art, in which live video content cannot be managed well, the computer device provided in the embodiment of the present application can effectively analyze whether live video content is sensitive, so that live video content can be managed effectively.
Processor 610 controls the operation of computer device 60, and processor 610 may also be referred to as a CPU (Central Processing Unit). Memory 650 may include both read-only memory and random-access memory, and provides instructions and data to processor 610. A portion of the memory 650 may also include non-volatile random access memory (NVRAM). The various components of computer device 60 are coupled together by a bus system 620 in a particular application, where bus system 620 may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, however, the various buses are labeled in the figure as bus system 620.
The method disclosed in the above embodiments of the present invention may be applied to, or implemented by, the processor 610. The processor 610 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated hardware logic circuits or software-form instructions in the processor 610. The processor 610 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 650, and the processor 610 reads the information in the memory 650 and performs the steps of the above method in combination with its hardware.
Optionally, the processor 610 is configured to:
extracting a bullet screen text and bullet screen flow data from the bullet screen data;
determining the word frequency characteristics of sensitive words contained in the bullet screen text, and determining at least one of the number of bullet screens, the clicked increment of the target live video and the proportion of the clicked sensitive video in the object clicking the target live video from the bullet screen flow data.
Optionally, the processor 610 is configured to:
determining the sensitivity index of the target live video according to the live evaluation model:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the word frequency feature and at least one of the clicked growth amount of the target live video and the proportion of clicked sensitive videos among the objects clicking the target live video; θ is a weight matrix whose weight values are obtained during training of the live evaluation model; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ.
Optionally, the processor 610 is further configured to:
acquiring barrage data of a first type of live video and barrage data of a second type of live video which are selected as samples, wherein the first type of live video is sensitive live broadcast, and the second type of live video is non-sensitive live broadcast;
and training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model.
Optionally, the processor 610 is configured to:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the features characterizing each sensitive dimension in each live video; θ is a weight matrix; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ;
and the value of θ is obtained through training with the features characterizing each sensitive dimension in each live video, so as to obtain the live evaluation model.
Optionally, the processor 610 is configured to:
respectively extracting training barrage texts and training barrage flow data in each live video from the first type of live barrage data and the second type of live barrage data;
determining the word frequency characteristics of sensitive words contained in the training barrage text, and determining at least one of the number of barrages used for training a model, the clicked increment of each live video and the clicked proportion of the sensitive video in the object of each live video from the training barrage flow data.
The above description of the computer device 60 can be understood with reference to the description of fig. 1 to 7, and will not be repeated herein.
In addition, the computer device in the embodiment of the present application may further implement a function of a device for establishing a live broadcast evaluation model, that is, an actual physical form of the device for establishing a live broadcast evaluation model may also be understood with reference to the hardware structure in fig. 11, where the corresponding hardware function is configured to complete the following steps:
the processor 610 is configured to:
acquiring barrage data of a first type of live broadcast and barrage data of a second type of live broadcast which are selected as samples, wherein the first type of live broadcast is sensitive live broadcast, and the second type of live broadcast is non-sensitive live broadcast;
and training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model.
Optionally, the processor 610 is configured to:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is:

h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx))

wherein h_θ(x), i.e. g(θᵀx), is the sensitivity index; x is a matrix comprising the features characterizing each sensitive dimension in each live video; θ is a weight matrix; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ;
and the value of θ is obtained through training with the features characterizing each sensitive dimension in each live video, so as to obtain the live evaluation model.
Optionally, the processor 610 is configured to:
respectively extracting training barrage texts and training barrage flow data in each live video from the first type of live barrage data and the second type of live barrage data;
determining the word frequency characteristics of sensitive words contained in the training barrage text, and determining at least one of the number of barrages used for training a model, the clicked increment of each live video and the clicked proportion of the sensitive video in the object of each live video from the training barrage flow data.
The above description of the corresponding functions in the computer device can be understood with reference to the corresponding descriptions in fig. 1 to fig. 7, and will not be repeated herein.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
The method for evaluating the live video, the method for establishing the live evaluation model, the device for establishing the live evaluation model and the equipment provided by the embodiment of the invention are described in detail, a specific example is applied in the text to explain the principle and the implementation mode of the invention, and the description of the embodiment is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (15)

1. A method of live video assessment, comprising:
acquiring bullet screen data of a target live broadcast video;
determining features representing each sensitive dimension from the bullet screen data, wherein the features representing each sensitive dimension comprise: at least one of the word frequency characteristics of the sensitive words, the number of barrage, the clicked growth amount of the target live video and the proportion of the clicked sensitive video in the object of the target live video;
inputting the characteristics of the sensitive dimensions into a live broadcast evaluation model, and determining the sensitivity index of the target live broadcast video, wherein the sensitivity index reflects the proportion of sensitive contents contained in the target live broadcast video;
when the sensitivity index is larger than a sensitivity threshold, determining that the target live video is a sensitive video;
the inputting the characteristics of the sensitive dimensions into a live broadcast evaluation model and determining the sensitivity index of the target live broadcast video comprises the following steps:
determining the sensitivity index of the target live video according to a live evaluation model;
h_θ(x) = g(θᵀx) = 1/(1 + e^(−θᵀx));
wherein h_θ(x) and g(θᵀx) are the sensitivity index; x is a matrix, and the x comprises at least one of the clicked growth amount of the target live video and the proportion of clicked sensitive videos among the objects clicking the target live video; θ is a weight matrix, the value of each weight in the weight matrix being obtained during training of the live evaluation model; the number of elements of θ corresponds to the number of elements of x, and θᵀ represents the transposed matrix of θ.
2. The method of claim 1, wherein said determining features characterizing each sensitive dimension from said bullet screen data comprises:
extracting a bullet screen text and bullet screen flow data from the bullet screen data;
determining the word frequency characteristics of sensitive words contained in the bullet screen text, and determining at least one of the number of bullet screens, the clicked increment of the target live video and the proportion of the clicked sensitive video in the object clicking the target live video from the bullet screen flow data.
3. The method according to any one of claims 1-2, further comprising:
acquiring barrage data of a first type of live video and barrage data of a second type of live video which are selected as samples, wherein the first type of live video is sensitive live broadcast, and the second type of live video is non-sensitive live broadcast;
and training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model.
4. The method of claim 3, wherein training an initial evaluation model according to the barrage data of the first type of live video and the barrage data of the second type of live video to obtain the live evaluation model comprises:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the characteristics representing the sensitive dimensions in each live video; θ is a weight matrix, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ;
training with the characteristics representing the sensitive dimensions in each live video to obtain θ, so as to obtain the live evaluation model.
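The training step of claims 3 and 4 (fitting the weight matrix θ from labeled first-type and second-type samples) can be sketched as plain stochastic gradient descent on a logistic loss — again assuming the sigmoid form of the model, which the reproduced text does not confirm:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_weights(samples, labels, lr=0.1, epochs=500):
    """Fit theta on labeled samples by stochastic gradient descent.

    samples: feature vectors, one per sample live video.
    labels:  1 for first-type (sensitive) and 0 for second-type
             (non-sensitive) live videos.
    """
    theta = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * f for w, f in zip(theta, x)))
            for j, xj in enumerate(x):
                theta[j] -= lr * (pred - y) * xj  # logistic-loss gradient
    return theta
```

The second feature in the usage below acts as a constant bias term, a common convention when the feature matrix does not already include one.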
5. The method of claim 4, wherein determining the features characterizing the sensitive dimensions in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video comprises:
extracting, from the barrage data of the first type of live video and the barrage data of the second type of live video, the training barrage text and the training barrage flow data of each live video, respectively;
determining the word frequency characteristic of sensitive words contained in the training barrage text, and determining, from the training barrage flow data, at least one of the number of barrages used for training the model, the clicked growth amount of each live video, and the proportion of objects that clicked each live video having also clicked sensitive videos.
6. A method for establishing a live evaluation model, characterized by comprising the following steps:
acquiring barrage data of a first type of live video and barrage data of a second type of live video that are selected as samples, wherein the first type of live video is sensitive live video and the second type of live video is non-sensitive live video;
determining, from the barrage data of the first type of live video and the barrage data of the second type of live video, the characteristics representing each sensitive dimension in each live video, and training an initial evaluation model according to the characteristics representing each sensitive dimension in each live video to obtain the live evaluation model, wherein the characteristics representing each sensitive dimension include: at least one of a word frequency characteristic of sensitive words, a number of barrages, a clicked growth amount of the live video, and a proportion of objects that clicked the live video having also clicked sensitive videos, and the live video includes: the first type of live video and the second type of live video;
the live evaluation model comprises h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the word frequency characteristic and at least one of the clicked growth amount of the live video and the proportion of objects that clicked the live video having also clicked sensitive videos; θ is a weight matrix, the value of each weight in θ is obtained in the live evaluation model training process, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ.
7. The method according to claim 6, wherein the determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video, and training an initial evaluation model according to the characteristics representing each sensitive dimension in each live video to obtain the live evaluation model comprises:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the characteristics representing the sensitive dimensions in each live video; θ is a weight matrix, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ;
training with the characteristics representing each sensitive dimension in each live video to obtain θ, so as to obtain the live evaluation model.
8. The method of claim 7, wherein determining the features characterizing the sensitive dimensions in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video comprises:
extracting, from the barrage data of the first type of live video and the barrage data of the second type of live video, the training barrage text and the training barrage flow data of each live video, respectively;
determining the word frequency characteristic of sensitive words contained in the training barrage text, and determining, from the training barrage flow data, at least one of the number of barrages used for training the model, the clicked growth amount of each live video, and the proportion of objects that clicked each live video having also clicked sensitive videos.
9. An apparatus for live video evaluation, comprising:
an obtaining program module, configured to obtain barrage data of a target live video;
a first determining program module, configured to determine, from the barrage data obtained by the obtaining program module, the characteristics representing each sensitive dimension, wherein the characteristics representing each sensitive dimension include: at least one of a word frequency characteristic of sensitive words, a number of barrages, a clicked growth amount of the target live video, and a proportion of objects that clicked the target live video having also clicked sensitive videos;
a second determining program module, configured to input the characteristics representing the sensitive dimensions determined by the first determining program module into a live evaluation model and determine the sensitivity index of the target live video, wherein the sensitivity index reflects the proportion of sensitive content contained in the target live video; the inputting the characteristics representing the sensitive dimensions into the live evaluation model and determining the sensitivity index of the target live video comprises: determining the sensitivity index of the target live video according to the live evaluation model h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the word frequency characteristic and at least one of the clicked growth amount of the target live video and the proportion of objects that clicked the target live video having also clicked sensitive videos; θ is a weight matrix, the value of each weight in θ is obtained in the live evaluation model training process, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ;
a third determining program module, configured to determine that the target live video is a sensitive video when the sensitivity index determined by the second determining program module is greater than a sensitivity threshold.
10. The apparatus of claim 9,
the first determining program module is configured to:
extract barrage text and barrage flow data from the barrage data;
determine the word frequency characteristic of sensitive words contained in the barrage text, and determine, from the barrage flow data, at least one of the number of barrages, the clicked growth amount of the target live video, and the proportion of objects that clicked the target live video having also clicked sensitive videos.
11. An apparatus for live evaluation model establishment, comprising:
an obtaining program module, configured to obtain barrage data of a first type of live video and barrage data of a second type of live video that are selected as samples, wherein the first type of live video is sensitive live video and the second type of live video is non-sensitive live video;
a model training program module, configured to determine, from the barrage data of the first type of live video and the barrage data of the second type of live video obtained by the obtaining program module, the characteristics representing each sensitive dimension in each live video, and to train an initial evaluation model according to the characteristics representing each sensitive dimension in each live video to obtain the live evaluation model, wherein the characteristics representing each sensitive dimension include: at least one of a word frequency characteristic of sensitive words, a number of barrages, a clicked growth amount of the live video, and a proportion of objects that clicked the live video having also clicked sensitive videos, and the live video includes: the first type of live video and the second type of live video;
the live evaluation model comprises h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the word frequency characteristic and at least one of the clicked growth amount of the live video and the proportion of objects that clicked the live video having also clicked sensitive videos; θ is a weight matrix, the value of each weight in θ is obtained in the live evaluation model training process, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ.
12. The apparatus of claim 11,
the model training program module is configured to:
determining the characteristics representing each sensitive dimension in each live video from the barrage data of the first type of live video and the barrage data of the second type of live video;
training the following initial evaluation model by using the characteristics representing the sensitive dimensions in each live video;
the initial evaluation model is h_θ(x) (the formula images are not reproduced in this text); wherein h_θ(x) is the sensitivity index; x is a matrix, and x includes the characteristics representing the sensitive dimensions in each live video; θ is a weight matrix, and the number of weights in θ corresponds to the number of features in x; θ^T represents the transposed matrix of θ;
training with the characteristics representing each sensitive dimension in each live video to obtain θ, so as to obtain the live evaluation model.
13. A computer device, comprising: an input/output (I/O) interface, a processor, and a memory, the memory storing instructions for live video evaluation according to any of claims 1-5;
the processor is configured to execute the instructions for live video evaluation stored in the memory, to perform the steps of the method for live video evaluation according to any of claims 1-5.
14. A computer device, comprising: an input/output (I/O) interface, a processor, and a memory, the memory storing instructions for live evaluation model establishment according to any of claims 6-8;
the processor is configured to execute the instructions for live evaluation model establishment stored in the memory, to perform the steps of the method for live evaluation model establishment according to any of claims 6-8.
15. A computer-readable storage medium having stored thereon a computer-executable program which, when loaded and executed by a processor, implements the method of live video evaluation according to any of claims 1-5 and/or the method of live evaluation model establishment according to any of claims 6-8.
CN201710380359.XA 2017-05-25 2017-05-25 Live video evaluation method, model establishment method, device and equipment Active CN108965916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710380359.XA CN108965916B (en) 2017-05-25 2017-05-25 Live video evaluation method, model establishment method, device and equipment


Publications (2)

Publication Number Publication Date
CN108965916A CN108965916A (en) 2018-12-07
CN108965916B true CN108965916B (en) 2021-05-25

Family

ID=64494458


Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109327715B (en) * 2018-08-01 2021-06-04 创新先进技术有限公司 Video risk identification method, device and equipment
CN111382383A (en) * 2018-12-28 2020-07-07 广州市百果园信息技术有限公司 Method, device, medium and computer equipment for determining sensitive type of webpage content
CN111147876A (en) * 2019-12-26 2020-05-12 山东爱城市网信息技术有限公司 Live broadcast content monitoring method, equipment, storage medium and platform based on block chain
CN112055230A (en) * 2020-09-03 2020-12-08 北京中润互联信息技术有限公司 Live broadcast monitoring method and device, computer equipment and readable storage medium
CN114598899B (en) * 2022-03-15 2023-06-16 中科大数据研究院 Barrage broadcasting analysis method based on crawlers
CN114840477B (en) * 2022-06-30 2022-09-27 深圳乐播科技有限公司 File sensitivity index determining method based on cloud conference and related product
CN116228320B (en) * 2023-03-01 2024-02-06 广州网优优数据技术股份有限公司 Live advertisement putting effect analysis system and method
CN117395470A (en) * 2023-08-31 2024-01-12 江苏初辰文化发展有限公司 Live broadcast content evaluation detection method based on barrage sharing

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8191152B1 (en) * 2009-01-23 2012-05-29 Intuit Inc. Methods systems and articles of manufacture for generating and displaying video related to financial application
CN103218608A (en) * 2013-04-19 2013-07-24 中国科学院自动化研究所 Network violent video identification method
CN104918066A (en) * 2014-03-11 2015-09-16 上海数字电视国家工程研究中心有限公司 Video content censoring method and system
CN105574003A (en) * 2014-10-10 2016-05-11 华东师范大学 Comment text and score analysis-based information recommendation method
CN105872773A (en) * 2016-06-01 2016-08-17 北京奇虎科技有限公司 Video broadcast monitoring method and device
CN105930411A (en) * 2016-04-18 2016-09-07 苏州大学 Classifier training method, classifier and sentiment classification system
CN105956550A (en) * 2016-04-29 2016-09-21 浪潮电子信息产业股份有限公司 Video discriminating method and device
CN106250837A (en) * 2016-07-27 2016-12-21 腾讯科技(深圳)有限公司 The recognition methods of a kind of video, device and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Heavyweight: Understand Deep Learning in One Day, 286-page PPT" (《重磅：286页一天搞懂深度学习ppt》); Gong Xun (龚勋); Docin (豆丁网); 2016-09-06; p. 11 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant