CN111770353A - Live broadcast monitoring method and device, electronic equipment and storage medium - Google Patents

Live broadcast monitoring method and device, electronic equipment and storage medium

Info

Publication number
CN111770353A
CN111770353A (application CN202010589239.2A)
Authority
CN
China
Prior art keywords
live broadcast
time period
detected
feature vector
broadcast room
Prior art date
Legal status
Pending
Application number
CN202010589239.2A
Other languages
Chinese (zh)
Inventor
周杰
王鸣辉
孙振邦
王长虎
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN202010589239.2A
Publication of CN111770353A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2407Monitoring of transmitted content, e.g. distribution time, number of downloads
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Alarm Systems (AREA)

Abstract

The present disclosure provides a live broadcast monitoring method and apparatus, an electronic device, and a storage medium. The live broadcast monitoring method includes: acquiring live broadcast feature vectors of a live broadcast room to be detected under multiple preset security feature dimensions in each of a plurality of consecutive time periods; determining a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vectors corresponding to the consecutive time periods; determining a security detection result corresponding to the live broadcast room to be detected based on the fused live broadcast feature vector; and performing live broadcast control based on the security detection result. The embodiments of the present disclosure improve the accuracy of live broadcast monitoring results.

Description

Live broadcast monitoring method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a live broadcast monitoring method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development of internet technology, live broadcasting has become widespread. A live broadcast platform provides a plurality of live broadcast rooms, and after entering a live broadcast room a user can watch the live video stream sent by the anchor of that room.
In a live broadcast room, a user may make inappropriate remarks or engage in inappropriate behavior. To manage this, the live broadcast management platform needs to monitor the remarks and behavior of users in each live broadcast room so as to manage each live broadcast room.
In the related art, a live broadcast room is monitored by randomly sampling the audio or live pictures of the live broadcast room to be detected. When the live broadcast room to be detected is monitored in this way, the accuracy of the resulting live broadcast monitoring result is low.
Disclosure of Invention
The embodiment of the present disclosure provides at least a live broadcast monitoring scheme to improve the accuracy of a live broadcast monitoring result.
In a first aspect, an embodiment of the present disclosure provides a live broadcast monitoring method, including:
acquiring live broadcast feature vectors of a live broadcast room to be detected under multiple preset security feature dimensions in each of a plurality of consecutive time periods; determining a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vectors corresponding to the consecutive time periods; determining a security detection result corresponding to the live broadcast room to be detected based on the fused live broadcast feature vector; and performing live broadcast control based on the security detection result.
In a possible implementation, acquiring the live broadcast feature vectors of the live broadcast room to be detected under the multiple preset security feature dimensions in each of the plurality of consecutive time periods includes:
acquiring live content of the live broadcast room to be detected in each of the consecutive time periods, the live content including live pictures and/or live audio, and extracting live content features corresponding to the live content under multiple preset live content security feature dimensions; and splicing, for each time period, the live content features corresponding to the time period with the historical behavior features and user attribute features of the live broadcast room to be detected for the time period, to obtain the live broadcast feature vector of the live broadcast room to be detected for the time period.
In a possible implementation, determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vectors corresponding to the consecutive time periods includes:
for each of the consecutive time periods, determining a plurality of consecutive target time periods corresponding to the time period, the plurality of consecutive target time periods including the time period; fusing the live broadcast feature vectors respectively corresponding to the plurality of consecutive target time periods to obtain a first fused live broadcast feature vector corresponding to the time period; and determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vectors corresponding to the respective time periods.
In a possible implementation, determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vectors corresponding to the respective time periods includes:
starting from the second of the consecutive time periods, fusing the first fused live broadcast feature vector corresponding to the current time period with a memory live broadcast feature vector corresponding to the previous time period to obtain a second fused live broadcast feature vector corresponding to the current time period; determining a memory live broadcast feature vector corresponding to the current time period based on the second fused live broadcast feature vector corresponding to the current time period, and fusing it with the first fused live broadcast feature vector corresponding to the next time period to obtain a second fused live broadcast feature vector corresponding to the next time period; and judging whether the next time period is the last of the consecutive time periods; if so, taking the second fused live broadcast feature vector corresponding to the next time period as the fused live broadcast feature vector corresponding to the live broadcast room to be detected; if not, taking the next time period as the current time period and returning to the step of determining the second fused live broadcast feature vector corresponding to the current time period.
In a possible implementation manner, performing live broadcast control based on the security detection result includes:
and if the safety detection result indicates that the live broadcast content of the to-be-detected live broadcast room does not accord with the preset safety detection condition, outputting the identification corresponding to the to-be-detected live broadcast room and the live broadcast content corresponding to the to-be-detected live broadcast room.
In one possible embodiment, the security detection result is determined by a pre-trained neural network;
the neural network is trained using live broadcast feature vectors of each of a plurality of sample live broadcast rooms under the multiple preset security feature dimensions in consecutive time periods, together with a pre-labeled security detection result corresponding to each sample live broadcast room.
In one possible embodiment, the neural network is trained in the following manner:
acquiring, for each sample live broadcast room, live broadcast feature vectors under the multiple preset security feature dimensions in each of a plurality of consecutive time periods; determining a fused live broadcast feature vector corresponding to the sample live broadcast room based on the live broadcast feature vectors corresponding to the consecutive time periods; predicting a security detection result corresponding to the sample live broadcast room based on the fused live broadcast feature vector corresponding to the sample live broadcast room; and adjusting network parameter values in the neural network based on the predicted security detection result and the actual (labeled) security detection result corresponding to each sample live broadcast room.
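As an illustration of this training manner, below is a minimal training-loop sketch. It assumes PyTorch, a data loader yielding (features, label) pairs, and a placeholder model; the binary-cross-entropy loss is an assumption consistent with the 0-to-1 score described later, not something specified by the disclosure.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-3):
    """Train a monitoring network on pre-labeled sample live broadcast rooms.

    features: (batch, num_periods, feature_dim) live broadcast feature vectors
    label:    (batch, 1) pre-labeled security detection result in [0, 1]
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCELoss()  # assumes the network ends in a sigmoid score
    for _ in range(epochs):
        for features, label in loader:
            predicted = model(features)         # predicted security detection result
            loss = criterion(predicted, label)  # compare with the actual labeled result
            optimizer.zero_grad()
            loss.backward()                     # adjust the network parameter values
            optimizer.step()
```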
In a second aspect, an embodiment of the present disclosure provides a live broadcast monitoring apparatus, including:
an acquisition module, configured to acquire live broadcast feature vectors of a live broadcast room to be detected under multiple preset security feature dimensions in each of a plurality of consecutive time periods; a first determining module, configured to determine a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vectors corresponding to the consecutive time periods; a second determining module, configured to determine a security detection result corresponding to the live broadcast room to be detected based on the fused live broadcast feature vector; and a control module, configured to perform live broadcast control based on the security detection result.
In a possible implementation, the acquisition module, when configured to acquire the live broadcast feature vectors of the live broadcast room to be detected under the multiple preset security feature dimensions in each of the plurality of consecutive time periods, is configured to:
acquire live content of the live broadcast room to be detected in each of the consecutive time periods, the live content including live pictures and/or live audio, and extract live content features corresponding to the live content under multiple preset live content security feature dimensions; and splice, for each time period, the live content features corresponding to the time period with the historical behavior features and user attribute features of the live broadcast room to be detected for the time period, to obtain the live broadcast feature vector of the live broadcast room to be detected for the time period.
In a possible implementation, the first determining module, when configured to determine the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vectors corresponding to the consecutive time periods, is configured to:
for each of the consecutive time periods, determine a plurality of consecutive target time periods corresponding to the time period, the plurality of consecutive target time periods including the time period; fuse the live broadcast feature vectors respectively corresponding to the plurality of consecutive target time periods to obtain a first fused live broadcast feature vector corresponding to the time period; and determine the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vectors corresponding to the respective time periods.
In a possible implementation, the first determining module, when configured to determine the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vectors corresponding to the respective time periods, is configured to:
starting from the second of the consecutive time periods, fuse the first fused live broadcast feature vector corresponding to the current time period with a memory live broadcast feature vector corresponding to the previous time period to obtain a second fused live broadcast feature vector corresponding to the current time period; determine a memory live broadcast feature vector corresponding to the current time period based on the second fused live broadcast feature vector corresponding to the current time period, and fuse it with the first fused live broadcast feature vector corresponding to the next time period to obtain a second fused live broadcast feature vector corresponding to the next time period; and judge whether the next time period is the last of the consecutive time periods; if so, take the second fused live broadcast feature vector corresponding to the next time period as the fused live broadcast feature vector corresponding to the live broadcast room to be detected; if not, take the next time period as the current time period and return to the step of determining the second fused live broadcast feature vector corresponding to the current time period.
In a possible embodiment, the control module, when configured to perform live broadcast control based on the security detection result, includes:
and if the safety detection result indicates that the live broadcast content of the to-be-detected live broadcast room does not accord with the preset safety detection condition, outputting the identification corresponding to the to-be-detected live broadcast room and the live broadcast content corresponding to the to-be-detected live broadcast room.
In a possible implementation, the live broadcast monitoring apparatus further includes a network training module, configured to train the neural network that determines the security detection result;
the neural network is trained using live broadcast feature vectors of each of a plurality of sample live broadcast rooms under the multiple preset security feature dimensions in consecutive time periods, together with a pre-labeled security detection result corresponding to each sample live broadcast room.
In one possible embodiment, the network training module is configured to train the neural network in the following manner:
acquiring, for each sample live broadcast room, live broadcast feature vectors under the multiple preset security feature dimensions in each of a plurality of consecutive time periods; determining a fused live broadcast feature vector corresponding to the sample live broadcast room based on the live broadcast feature vectors corresponding to the consecutive time periods; predicting a security detection result corresponding to the sample live broadcast room based on the fused live broadcast feature vector corresponding to the sample live broadcast room; and adjusting network parameter values in the neural network based on the predicted security detection result and the actual (labeled) security detection result corresponding to each sample live broadcast room.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the live monitoring method according to the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the steps of the live broadcast monitoring method according to the first aspect.
The embodiments of the present disclosure provide a live broadcast monitoring scheme. Live broadcast feature vectors of a live broadcast room to be detected are acquired under multiple preset security feature dimensions in each of a plurality of consecutive time periods, for example in each of 10 consecutive time periods. A fused live broadcast feature vector corresponding to the live broadcast room to be detected is then determined from the live broadcast feature vectors corresponding to the consecutive time periods; in other words, the live broadcast room to be detected is monitored over a duration, so that the security detection result corresponding to it can be obtained more accurately. Live broadcast control is then performed based on this security detection result, which effectively improves the live broadcast environment.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art may derive other related drawings from them without creative effort.
Fig. 1 shows a flowchart of a live broadcast monitoring method provided by an embodiment of the present disclosure;
fig. 2 shows a flowchart of a method for determining a fused live broadcast feature vector corresponding to a to-be-detected live broadcast room according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a structure of a time dimension convolutional layer in a neural network provided by an embodiment of the present disclosure;
fig. 4 is a flowchart illustrating another method for determining a fused live broadcast feature vector corresponding to a live broadcast room to be detected according to the embodiment of the present disclosure;
FIG. 5 is a diagram illustrating a specific process of a security detection result provided by an embodiment of the present disclosure;
FIG. 6 is a flow chart of a method for training a neural network provided by an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram illustrating a live broadcast monitoring apparatus provided in an embodiment of the present disclosure;
fig. 8 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments of the present disclosure; obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the disclosure. All other embodiments obtained by a person skilled in the art from the embodiments of the disclosure without creative effort shall fall within the protection scope of the disclosure.
The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, including at least one of A, B and C may mean including any one or more elements selected from the group consisting of A, B and C.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In a live broadcast scenario such as a live broadcast room, a user may make inappropriate remarks or engage in inappropriate behavior. To manage this, the live broadcast management platform needs to monitor the remarks and behavior of users in each live broadcast room so as to manage each live broadcast room.
Live broadcast monitoring approaches in the related art are usually simple and mechanical, for example monitoring a live broadcast room by randomly sampling the audio or live pictures of the live broadcast room to be detected; when the live broadcast room to be detected is monitored in this way, the monitoring accuracy is low.
In view of the above, the embodiments of the present disclosure provide a live broadcast monitoring scheme. Live broadcast feature vectors of a live broadcast room to be detected are acquired under multiple preset security feature dimensions in each of a plurality of consecutive time periods, for example in each of ten consecutive time periods. A fused live broadcast feature vector corresponding to the live broadcast room to be detected is then determined from the live broadcast feature vectors corresponding to the consecutive time periods; in other words, the live broadcast room to be detected is monitored over a duration, so that the security detection result corresponding to it can be obtained more accurately, and live broadcast control based on this result effectively improves the live broadcast environment.
To facilitate understanding of the present embodiments, a live broadcast monitoring method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the live broadcast monitoring method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, for example a server or another processing device. The live broadcast monitoring method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to fig. 1, a flowchart of a live broadcast monitoring method provided in the embodiment of the present disclosure is shown, where the live broadcast monitoring method specifically includes the following steps S101 to S104.
S101: acquiring live broadcast feature vectors of a live broadcast room to be detected under multiple preset security feature dimensions in each of a plurality of consecutive time periods.
The live broadcast room to be detected may be a virtual room in which live broadcasting is performed, for example through a given client. The live broadcast room may correspond to a live broadcast room identifier, which may be the user name, mobile phone number, or another account number of the user who hosts the broadcast.
A time period may be a window of a set duration starting from a set time. For example, if the set time is 1 min after the live broadcast starts and each time period has a set duration of 20 s, then for a broadcast starting at 9:00 the first time period may be 9:01:00 to 9:01:20 and the second time period may be 9:01:20 to 9:01:40.
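A small sketch of how such time windows could be computed; the 1-minute offset, 20-second window length and count of 10 are the example values from the text, while the function name and interface are illustrative only.

```python
from datetime import datetime, timedelta

def time_periods(broadcast_start: datetime, offset_s: int = 60, window_s: int = 20, count: int = 10):
    """Yield `count` consecutive (begin, end) windows of `window_s` seconds,
    starting `offset_s` seconds after the broadcast starts."""
    begin = broadcast_start + timedelta(seconds=offset_s)
    for _ in range(count):
        end = begin + timedelta(seconds=window_s)
        yield begin, end
        begin = end

# For a broadcast starting at 9:00, the first window is 9:01:00-9:01:20, the second 9:01:20-9:01:40.
for begin, end in time_periods(datetime(2020, 6, 25, 9, 0)):
    print(begin.time(), "-", end.time())
```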
Here, the security feature dimensions may include multiple dimensions for evaluating the live broadcast, in particular feature dimensions for evaluating whether the live broadcast carries risk, for example whether the anchor of the live broadcast room exhibits inappropriate behavior or makes inappropriate statements, whether the live broadcast room has been reported in the past and how many times, and attribute features of the anchor. Based on these multiple security feature dimensions, the live broadcast feature vector corresponding to the live broadcast room to be detected can be obtained; the specific acquisition manner is described in detail below.
S102: determining a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vector corresponding to each of the plurality of consecutive time periods.
Here, each time period of the live broadcast room to be detected corresponds to one live broadcast feature vector, which can be used to evaluate the security of the live broadcast room in that period, for example whether a risk exists in that period. The live broadcast room to be detected is monitored continuously over a plurality of time periods, and by accumulating temporal information it can be determined more accurately whether a risk exists. For example, the anchor of the live broadcast room to be detected may exhibit some slight violations in a single time period; considered within that single period alone, the room would not be regarded as risky, but if the violations keep worsening over time, the risk of the room may become high. By comprehensively considering the live broadcast feature vectors of the plurality of time periods to obtain the fused live broadcast feature vector corresponding to the live broadcast room to be detected, whether a risk exists in the room can therefore be determined accurately.
S103: determining a security detection result corresponding to the live broadcast room to be detected based on the fused live broadcast feature vector.
The security detection result may indicate whether a risk exists in the live broadcast room to be detected, and may be represented by a risk score or by a risk level. For example, a risk score corresponding to the live broadcast room to be detected may be determined from the fused live broadcast feature vector; the score may lie between 0 and 1, and a risk score threshold, for example 0.7, may be preset, so that when the risk score of the room reaches 0.7 it can be determined that the room is very likely to be risky. Alternatively, a risk score is first determined from the fused live broadcast feature vector and a risk level is then derived from it, for example 0 to 0.4 is a low risk level, 0.4 to 0.7 a medium risk level, and 0.7 to 1 a high risk level; if the risk level corresponding to the live broadcast room to be detected is the high risk level, it can be determined that the room is very likely to be risky.
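A sketch of the example score-to-level mapping and threshold check described above; the thresholds 0.4 and 0.7 come from the example, while the function names are illustrative.

```python
def risk_level(score: float) -> str:
    """Map a risk score in [0, 1] to the example risk levels from the text."""
    if score < 0.4:
        return "low"
    if score < 0.7:
        return "medium"
    return "high"

def fails_security_condition(score: float, threshold: float = 0.7) -> bool:
    """Preset security detection condition: the room is flagged once the score reaches the threshold."""
    return score >= threshold
```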
S104: performing live broadcast control based on the security detection result.
Here, performing live broadcast control based on the security detection result may include:
if the security detection result indicates that the live content of the live broadcast room to be detected does not meet a preset security detection condition, outputting an identifier of the live broadcast room to be detected and the live content corresponding to the live broadcast room to be detected.
Specifically, the security detection condition corresponds to the form of the security detection result: if the result is represented by a risk score, the preset security condition is that the risk score is below the risk score threshold; if the result is represented by a risk level, the preset security condition may be that the risk level is below the high risk level.
The identifier of the live broadcast room to be detected may be the room ID, and the live content corresponding to the room may include its live pictures and/or live audio over the plurality of consecutive time periods. The identifier and the corresponding live content may be output to a client used by an operator, who can then perform a further security check on the live pictures and/or live audio of those time periods based on the output identifier and content.
Through the above S101 to S104, the embodiments of the present disclosure provide a live broadcast monitoring scheme. Live broadcast feature vectors of the live broadcast room to be detected are acquired under multiple preset security feature dimensions in each of a plurality of consecutive time periods, for example in each of 10 consecutive time periods. A fused live broadcast feature vector corresponding to the live broadcast room to be detected is then determined from the live broadcast feature vectors corresponding to the consecutive time periods; in other words, the live broadcast room to be detected is monitored over a duration, so that the security detection result corresponding to it can be obtained more accurately, and live broadcast control based on this result effectively improves the live broadcast environment.
In addition, after the security detection result corresponding to the live broadcast room to be detected is determined automatically, only the rooms with higher risk are reviewed manually. There is thus no need to monitor every live broadcast room manually, which reduces labor cost and improves live broadcast monitoring efficiency.
The above S101 to S104 are described in detail below with reference to specific embodiments.
For S101, acquiring the live broadcast feature vectors of the live broadcast room to be detected under the multiple preset security feature dimensions in each of the consecutive time periods may include:
(1) acquiring the live content of the live broadcast room to be detected in each of the consecutive time periods, the live content including live pictures and/or live audio, and extracting live content features corresponding to the live content under the multiple preset live content security feature dimensions;
(2) splicing, for each time period, the live content features corresponding to the time period with the historical behavior features and user attribute features of the live broadcast room to be detected for that time period, to obtain the live broadcast feature vector of the live broadcast room to be detected for that time period.
When acquiring the live pictures of the live broadcast room to be detected in each of the consecutive time periods, one frame may be extracted at set intervals; for example, for the time period 0 to 20 s, extracting one frame every 2 s yields the 10 frames corresponding to that period.
When acquiring the live audio of the live broadcast room to be detected in each of the consecutive time periods, audio of a set duration may be extracted from each period; for example, 10 s of live audio may be extracted for the time period 0 to 20 s as the live audio corresponding to that period.
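A sketch of the sampling described above (one frame every 2 s and a fixed 10 s of audio per 20 s period); the helper names and the assumed raw-audio format (16-bit mono PCM) are illustrative assumptions.

```python
def frame_indices(fps: int, period_s: int = 20, step_s: int = 2) -> list:
    """Indices of the frames to keep: one frame every `step_s` seconds (10 frames for a 20 s period)."""
    return [t * fps for t in range(0, period_s, step_s)]

def clip_audio(audio: bytes, sample_rate: int, clip_s: int = 10) -> bytes:
    """Keep the first `clip_s` seconds of the period's raw audio (assumed 16-bit mono PCM)."""
    return audio[: clip_s * sample_rate * 2]
```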
After the live content corresponding to each time period is obtained, it can be fed into various pre-trained live content security detection networks to obtain the live content features of that period under the multiple preset live content security feature dimensions.
For example, the pre-trained live content security detection networks may include a detection network for behavior risk features, a detection network for bullet-comment risk features, a detection network for anchor speech risk features, and so on; feeding the live content into the different detection networks yields the corresponding live content features. In particular, when a time period corresponds to multiple frames of live pictures, the live content feature of that period under a given live content security feature dimension can be determined by inputting each frame into the detection network for that dimension, obtaining a score for each frame under that dimension, and taking the highest of the per-frame scores as the live content feature of that dimension for the period.
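A sketch of the per-dimension maximum over frame scores; `detectors` stands in for the pre-trained live content security detection networks, and the assumption that each one maps a batch of frames to one score per frame is illustrative.

```python
import torch

def content_features(frames: torch.Tensor, detectors: list) -> torch.Tensor:
    """frames: (num_frames, C, H, W) live pictures of one time period.

    Each detector is assumed to map the frame batch to one score per frame; the
    feature for that security dimension is the maximum score over the frames."""
    with torch.no_grad():
        scores = [float(det(frames).max()) for det in detectors]
    return torch.tensor(scores)  # one value per live content security feature dimension
```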
Besides the live content features, historical behavior features and user attribute features corresponding to each time period of the live broadcast room to be detected can be obtained. The historical behavior features may include the number of times the live broadcast room was reported for behavioral or verbal violations in a historical period. The user attribute features may include the gender, age and similar attributes of the anchor of the live broadcast room to be detected; these attributes can also influence the security detection result. For example, if big-data statistics show that fights rarely break out in live broadcast rooms hosted by young women, then when the user attribute features indicate a young female anchor, the risk of a fight in the live broadcast room to be detected is low.
Further, after the live content features corresponding to each time period and the historical behavior features and user attribute features of the live broadcast room to be detected for that period are obtained, they are spliced to obtain the live broadcast feature vector of the live broadcast room to be detected for that period. For example, if each time period yields 100 live content feature values, 15 historical behavior feature values and 17 user attribute feature values, then the live broadcast feature vector corresponding to each time period of the live broadcast room to be detected has 132 feature values.
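A sketch of the splicing step with the example dimensions from the text (100 + 15 + 17 = 132 feature values); the tensor names are illustrative.

```python
import torch

def live_feature_vector(content_feat: torch.Tensor,    # (100,) live content features
                        behavior_feat: torch.Tensor,   # (15,)  historical behavior features
                        attribute_feat: torch.Tensor   # (17,)  user attribute features
                        ) -> torch.Tensor:
    """Splice the per-period features into one 132-dimensional live broadcast feature vector."""
    return torch.cat([content_feat, behavior_feat, attribute_feat], dim=0)
```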
In the embodiments of the present disclosure, the live content features corresponding to the live content under the multiple preset live content security feature dimensions are extracted and then spliced with the historical behavior features and user attribute features of the live broadcast room to be detected for each time period. This yields a multi-angle live broadcast feature vector representing whether the live broadcast room to be detected is safe, so that the room is monitored from multiple angles and security problems can be detected effectively in later steps.
For S102, determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vector corresponding to each of the consecutive time periods, as shown in fig. 2, specifically includes the following S201 to S203:
S201: for each of the consecutive time periods, determining a plurality of consecutive target time periods corresponding to the time period, the plurality of consecutive target time periods including the time period;
S202: fusing the live broadcast feature vectors respectively corresponding to the plurality of consecutive target time periods to obtain a first fused live broadcast feature vector corresponding to the time period;
S203: determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vectors corresponding to the respective time periods.
Here the live broadcast feature vectors are fused for the first time. During fusion they may be fused by a time-dimension convolutional layer in a pre-trained neural network, which may be the pre-trained neural network used in the embodiments of the present disclosure to determine the security detection result (the neural network is described in detail later). For each time period, the plurality of target time periods corresponding to that period can be determined from the pre-trained time-dimension convolutional layer; for example, the target time periods corresponding to each time period may include the time period itself, a preset number of time periods before it, and a preset number of time periods after it.
For example, the consecutive time periods may be ten consecutive time periods. If the time-dimension convolutional layer determines that each of the ten periods corresponds to three target time periods, for example the target time periods corresponding to the second period are the first, second and third periods, then the first fused live broadcast feature vector corresponding to the second period is obtained by fusing the live broadcast feature vectors corresponding to the first, second and third periods.
Specifically, for the first of the consecutive time periods, the live broadcast feature vectors of the preset number of periods before it may be preset live broadcast feature vectors.
Specifically, there may be multiple time-dimension convolutional layers, and the first fused live broadcast feature vector corresponding to each time period is obtained after fusion by all of them. As shown in fig. 3, taking a pre-trained neural network containing two time-dimension convolutional layers and ten consecutive time periods as an example, the fusion of the live broadcast feature vectors corresponding to the consecutive target time periods of each period proceeds as follows:
The live broadcast feature vectors corresponding to the ten consecutive time periods are input into the neural network. If the receptive field of each neuron in the first time-dimension convolutional layer is 3, each neuron fuses the live broadcast feature vectors corresponding to the three time periods associated with it. During fusion, the target feature values to be fused in the live broadcast feature vectors are determined, for example the 1st, 2nd and 5th feature values; the target feature values are fused according to predetermined fusion weights, and the fused values are taken as new feature values. After the first time-dimension convolutional layer, a currently fused live broadcast feature vector is obtained for each time period. The second time-dimension convolutional layer then fuses in the same manner; its receptive field is 5, i.e., as shown in fig. 3, the currently fused live broadcast feature vectors of 5 consecutive time periods are fused, and after this fusion the final first fused live broadcast feature vector corresponding to each time period is obtained.
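A sketch of the two time-dimension convolutional layers, assuming PyTorch and a kernel size of 3 per layer (which gives a receptive field of 3 after the first layer and 5 over the input after the second, matching the example); the 132-to-256 dimensions are taken from the fig. 5 embodiment, and zero padding stands in for the preset live broadcast feature vectors at the boundary periods.

```python
import torch
import torch.nn as nn

class TimeConvFusion(nn.Module):
    """Fuse each period's live feature vector with those of neighbouring periods."""
    def __init__(self, in_dim: int = 132, out_dim: int = 256):
        super().__init__()
        self.conv1 = nn.Conv1d(in_dim, out_dim, kernel_size=3, padding=1)   # receptive field 3
        self.conv2 = nn.Conv1d(out_dim, out_dim, kernel_size=3, padding=1)  # receptive field 5 over the input
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_periods, in_dim); convolve along the time-period axis
        x = x.transpose(1, 2)                 # (batch, in_dim, num_periods)
        x = self.act(self.conv1(x))
        x = self.act(self.conv2(x))
        return x.transpose(1, 2)              # (batch, num_periods, out_dim) first fused vectors
```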
In the embodiments of the present disclosure, for each time period the live broadcast feature vectors corresponding to the plurality of time periods associated with it are fused to obtain the first fused live broadcast feature vector for that period. During fusion, feature values of different dimensions in the live broadcast feature vectors can be updated in combination with the time periods, yielding first fused live broadcast feature vectors better suited to security monitoring of the live broadcast room to be detected.
After the first fused live broadcast feature vector corresponding to each time period is obtained, the fused live broadcast feature vector corresponding to the live broadcast room to be detected can be determined based on these first fused live broadcast feature vectors. As shown in fig. 4, this specifically includes the following S401 to S403:
S401: starting from the second of the consecutive time periods, fusing the first fused live broadcast feature vector corresponding to the current time period with a memory live broadcast feature vector corresponding to the previous time period, to obtain a second fused live broadcast feature vector corresponding to the current time period;
S402: determining a memory live broadcast feature vector corresponding to the current time period based on the second fused live broadcast feature vector corresponding to the current time period, and fusing it with the first fused live broadcast feature vector corresponding to the next time period, to obtain a second fused live broadcast feature vector corresponding to the next time period;
S403: judging whether the next time period is the last of the consecutive time periods; if so, taking the second fused live broadcast feature vector corresponding to the next time period as the fused live broadcast feature vector corresponding to the live broadcast room to be detected; if not, taking the next time period as the current time period and returning to the step of determining the second fused live broadcast feature vector corresponding to the current time period.
S401 to S403 describe the process of determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected. A Long Short-Term Memory network (LSTM) is introduced into this process: the fused live broadcast feature vector corresponding to each live broadcast room to be detected is determined by the LSTM in the pre-trained neural network. For one live broadcast room to be detected, the process is as follows:
Starting from the second of the consecutive time periods, the second fused live broadcast feature vectors are determined in turn. For the LSTM cell corresponding to the current time period, the first fused live broadcast feature vector corresponding to the current time period is fused with the memory live broadcast feature vector output by the LSTM for the previous time period, yielding the second fused live broadcast feature vector corresponding to the current time period. When these two vectors are fused, the target feature values to be fused may be fused according to predetermined fusion weights.
It is then judged whether the current time period is the last of the consecutive time periods; if so, the second fused live broadcast feature vector corresponding to the current time period is taken directly as the fused live broadcast feature vector corresponding to the live broadcast room to be detected.
If the current time period is not the last of the consecutive time periods, the memory live broadcast feature vector corresponding to the current time period, obtained by weighting the feature values of the second fused live broadcast feature vector of the current period according to preset importance, is output to the LSTM cell of the next time period, where it is used together with the first fused live broadcast feature vector of the next period to determine the second fused live broadcast feature vector of the next period. S403 is then executed, i.e., it is judged whether that next time period is the last of the consecutive time periods.
In particular, the second fused live broadcast feature vector corresponding to the first of the consecutive time periods may be obtained by fusing the first fused live broadcast feature vector corresponding to the first time period with a preset memory live broadcast feature vector.
Specifically, before the first fused live broadcast feature vector corresponding to the current time period is input into the LSTM of the current time period, it may first be input into a fully connected layer; after this mapping it is input into the LSTM of the current period.
The above describes the case of a single LSTM layer. When two LSTM layers are included, the second fused live broadcast feature vector corresponding to each time period is further input into the next LSTM layer for that period. For clarity, the first LSTM layer for each time period may be called the first LSTM of that period and the second layer the second LSTM of that period. For the second LSTM of each time period, the second fused live broadcast feature vector of that period is fused with the memory live broadcast feature vector output by the second LSTM of the previous period, to obtain a third fused live broadcast feature vector for that period. It is then judged whether the period is the last of the consecutive time periods; if so, the third fused live broadcast feature vector of that period is taken as the fused live broadcast feature vector of the live broadcast room to be detected; if not, the memory live broadcast feature vector of the second LSTM for that period is determined from the third fused live broadcast feature vector, and is then fused with the second fused live broadcast feature vector of the next period to obtain the third fused live broadcast feature vector of the next period.
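A sketch of the two-layer temporal fusion, assuming a standard PyTorch `nn.LSTM` whose hidden and cell states play the role of the per-period memory live broadcast feature vectors, with the output of the last time period taken as the fused live broadcast feature vector; the 256-dimensional size is taken from the fig. 5 embodiment.

```python
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    """Fold the per-period first fused vectors into one fused vector per room."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.fc = nn.Linear(dim, dim)  # optional fully connected mapping before the LSTM
        self.lstm = nn.LSTM(dim, dim, num_layers=2, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_periods, dim) first fused live broadcast feature vectors
        out, _ = self.lstm(self.fc(x))  # hidden/cell states act as the per-period memory vectors
        return out[:, -1, :]            # last period's output = fused live broadcast feature vector
```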
In the embodiment of the present disclosure, the first fused live broadcast feature vector corresponding to the current time period is fused with the memory live broadcast feature vector corresponding to the previous time period, so that the second fused live broadcast feature vector corresponding to the last of the consecutive time periods is finally obtained. In this way the live broadcast room to be detected is monitored continuously over time, and the security detection result corresponding to the live broadcast room to be detected can be obtained more accurately.
After the fused live broadcast feature vector corresponding to the live broadcast room to be detected is obtained, the security detection result corresponding to the live broadcast room to be detected can be determined based on this fused live broadcast feature vector, for example in the following manner:
the fused live broadcast feature vector of the live broadcast room to be detected is input into a recombination layer of the pre-trained neural network for recombination; the recombined result is input into a fully connected layer for mapping; and the mapping result is processed by a sigmoid activation function, yielding a score, expressed as a value between 0 and 1, that characterizes the security detection result of the live broadcast room to be detected.
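As a minimal sketch of this scoring step (dimensions assumed, and the recombination layer interpreted here simply as a reshape), the fused vector is reshaped, mapped by a fully connected layer, and squashed by a sigmoid into a score between 0 and 1:

```python
import torch
import torch.nn as nn

class DetectionHead(nn.Module):
    """Sketch of the scoring head: recombination (here a reshape), a fully
    connected layer, and a sigmoid that yields a score between 0 and 1."""

    def __init__(self, in_dim: int = 256):
        super().__init__()
        self.fc = nn.Linear(in_dim, 1)

    def forward(self, fused_vec: torch.Tensor) -> torch.Tensor:
        x = fused_vec.reshape(fused_vec.size(0), -1)  # recombination of the fused vector
        score = torch.sigmoid(self.fc(x))             # security score in (0, 1)
        return score.squeeze(-1)

scores = DetectionHead()(torch.randn(512, 256))
print(scores.min().item(), scores.max().item())  # both values lie within (0, 1)
```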
The whole process is described in an embodiment with reference to fig. 5:
in this embodiment, 512 live broadcast rooms to be detected are monitored simultaneously. First, the 132-dimensional live broadcast feature vectors corresponding to 10 time periods are acquired for each of the 512 live broadcast rooms to be detected, and these vectors are input into the neural network. After two time-dimension convolutional layers, 256-dimensional first fused live broadcast feature vectors corresponding to the 10 time periods of the 512 live broadcast rooms are obtained. The information is then split by time period, that is, the first fused live broadcast feature vector corresponding to each time period of each of the 512 live broadcast rooms is obtained. After two layers of long short-term memory networks, the fused live broadcast feature vectors corresponding to the 512 live broadcast rooms to be detected are obtained. Finally, after processing by the recombination layer, the fully connected layer and the sigmoid function, the scores characterizing the security detection results of the 512 live broadcast rooms to be detected are obtained.
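This walk-through can be approximated end to end as follows. Only the sizes stated in the example (512 rooms, 10 periods, 132-dimensional inputs, 256-dimensional fused vectors, two temporal convolutional layers, two LSTM layers, and a recombination plus fully connected plus sigmoid head) come from the text; kernel sizes, padding, activations and the exact fusion details are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class LiveMonitorNet(nn.Module):
    """Illustrative end-to-end sketch of the fig. 5 example (layer sizes from the
    text; kernel sizes, padding and activations assumed)."""

    def __init__(self, in_dim: int = 132, mid_dim: int = 256):
        super().__init__()
        # two time-dimension convolutional layers: 132-d -> 256-d per period
        self.temporal = nn.Sequential(
            nn.Conv1d(in_dim, mid_dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(mid_dim, mid_dim, kernel_size=3, padding=1), nn.ReLU(),
        )
        # two layers of long short-term memory networks over the periods
        self.lstm = nn.LSTM(mid_dim, mid_dim, num_layers=2, batch_first=True)
        # recombination + fully connected layer + sigmoid
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(mid_dim, 1), nn.Sigmoid())

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: [rooms, periods, 132]; Conv1d expects [rooms, channels, periods]
        x = self.temporal(feats.transpose(1, 2)).transpose(1, 2)  # [rooms, 10, 256]
        out, _ = self.lstm(x)                                     # per-period fused vectors
        return self.head(out[:, -1, :]).squeeze(-1)               # one score per room

scores = LiveMonitorNet()(torch.randn(512, 10, 132))
print(scores.shape)  # torch.Size([512])
```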
The security detection result obtained in the embodiment of the present disclosure may be determined by a pre-trained neural network;
the neural network is trained using the live broadcast feature vectors of each of a plurality of sample live broadcast rooms, in a plurality of consecutive time periods and under a plurality of preset security feature dimensions, together with the pre-labeled security detection result corresponding to each sample live broadcast room.
As shown in fig. 6, the training process of the neural network is described. Specifically, the neural network is trained in the following manner, including S601 to S604:
S601, acquiring live broadcast feature vectors of each sample live broadcast room in each of a plurality of consecutive time periods and under a plurality of preset security feature dimensions.
Here, the manner of determining the live broadcast feature vectors of each sample live broadcast room in the consecutive time periods under the preset security feature dimensions is similar to the manner, described above, of determining the live broadcast feature vectors of the live broadcast room to be detected in the consecutive time periods under the preset security feature dimensions, and is not repeated here.
S602, determining a fusion live broadcast feature vector corresponding to each sample live broadcast room in a plurality of continuous time periods based on the live broadcast feature vector corresponding to each sample live broadcast room in each continuous time period.
Here, the manner of determining the fused live broadcast feature vector corresponding to each sample live broadcast room is similar to that described above, and details are not repeated here.
S603, predicting a safety detection result corresponding to the sample live broadcast room based on the fusion live broadcast feature vector corresponding to the sample live broadcast room.
Here, the manner of predicting the security detection result corresponding to the sample live broadcast room is similar to the manner of determining the security detection result corresponding to the live broadcast room to be detected, which is not described herein again.
S604, adjusting network parameter values in the neural network based on the predicted safety detection result corresponding to each sample live broadcast room and the actual safety detection result corresponding to the sample live broadcast room.
The actual security detection result corresponding to each sample live broadcast room may be determined in advance through manual review. A loss function value can then be computed from the predicted security detection result of each sample live broadcast room and the actual security detection result labeled for that sample live broadcast room, and the network parameter values in the neural network are adjusted according to this loss function value until the loss function value is smaller than a set threshold or the number of training iterations reaches a set number, thereby obtaining the neural network used for predicting the security detection result corresponding to the live broadcast room to be detected.
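A hedged sketch of this S601 to S604 loop is given below. The stand-in network, the binary cross-entropy loss, the optimizer, the 0.05 loss threshold and the 100-iteration cap are all assumptions used only to illustrate adjusting the parameters until the loss falls below a set threshold or the number of training iterations reaches a set number:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network and data; dimensions follow the 10-period,
# 132-dimensional example, labels are the manually reviewed results (1 = violating).
model = nn.Sequential(nn.Flatten(), nn.Linear(10 * 132, 64), nn.ReLU(),
                      nn.Linear(64, 1), nn.Sigmoid())
sample_feats = torch.randn(512, 10, 132)
manual_labels = torch.randint(0, 2, (512, 1)).float()

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCELoss()                        # loss between predicted and actual result

loss_threshold, max_epochs = 0.05, 100          # assumed stopping conditions
for epoch in range(max_epochs):
    optimizer.zero_grad()
    predicted = model(sample_feats)             # S603: predict security detection results
    loss = criterion(predicted, manual_labels)  # compare with the labelled results
    loss.backward()
    optimizer.step()                            # S604: adjust network parameter values
    if loss.item() < loss_threshold:            # stop once the loss is below the threshold
        break                                   # or after the set number of iterations
```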
Further, the specific manner of obtaining the live broadcast feature vectors of each sample live broadcast room in each of the consecutive time periods under the preset security feature dimensions is similar to the manner, introduced above, of obtaining the live broadcast feature vectors of the live broadcast room to be detected in each of the consecutive time periods under the preset security feature dimensions, and is not described here again.
Further, the manner of determining the fused live broadcast feature vector corresponding to each sample live broadcast room, based on the live broadcast feature vector corresponding to that sample live broadcast room in each of the consecutive time periods, is similar to the manner described above for determining the fused live broadcast feature vector corresponding to the live broadcast room to be detected; that is, both involve a time-dimension convolutional layer, a long short-term memory network and a fully connected layer. In particular, each fully connected layer may include a plurality of neurons, and when the fully connected layer is trained, it may be trained in combination with a dropout layer. For example, when the fully connected layer propagates data, the activation of a given neuron is stopped with a certain probability p, which effectively reduces overfitting of the fully connected layer and makes its generalization stronger.
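The dropout behaviour described here, stopping a neuron's activation with probability p while the fully connected layer is trained, corresponds to a standard dropout layer. The sketch below, with an assumed p of 0.5 and assumed layer sizes, shows dropout active in training mode and disabled at inference time:

```python
import torch
import torch.nn as nn

p = 0.5                                   # assumed drop probability
fc_block = nn.Sequential(
    nn.Linear(256, 128), nn.ReLU(),
    nn.Dropout(p),                        # randomly stops neuron activations during training
    nn.Linear(128, 64), nn.ReLU(),
    nn.Dropout(p),
    nn.Linear(64, 1), nn.Sigmoid(),
)

fc_block.train()                          # dropout active while training
train_out = fc_block(torch.randn(8, 256))
fc_block.eval()                           # dropout disabled at inference time
eval_out = fc_block(torch.randn(8, 256))
```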
In the embodiment of the present disclosure, live broadcast feature vectors of a sample live broadcast room under a plurality of preset security feature dimensions are acquired for each of a plurality of consecutive time periods, for example for each of 10 consecutive time periods. The fused live broadcast feature vector corresponding to the sample live broadcast room is then determined from the live broadcast feature vectors corresponding to these consecutive time periods, the security detection result corresponding to the sample live broadcast room is predicted from the fused live broadcast feature vector, and the neural network is adjusted according to the predicted security detection result and the actual security detection result of the sample live broadcast room. In this way, a neural network with higher accuracy for performing security detection on the live broadcast room to be detected is obtained.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict order of execution or any limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible internal logic.
Based on the same technical concept, a live broadcast monitoring device corresponding to the live broadcast monitoring method is further provided in the embodiment of the present disclosure, and as the principle of solving the problem of the device in the embodiment of the present disclosure is similar to that of the live broadcast monitoring method in the embodiment of the present disclosure, reference may be made to the implementation of the method for implementing the device, and repeated details are not described again.
Referring to fig. 7, which is a schematic diagram of a live broadcast monitoring apparatus 700 according to an embodiment of the present disclosure, the live broadcast monitoring apparatus 700 includes: an acquisition module 701, a first determination module 702, a second determination module 703 and a control module 704.
The acquisition module 701 is configured to acquire live broadcast feature vectors of a to-be-detected live broadcast room in multiple preset security feature dimensions in each of multiple continuous time periods;
a first determining module 702, configured to determine, based on a live broadcast feature vector corresponding to each time period in a plurality of consecutive time periods of a to-be-detected live broadcast room, a fusion live broadcast feature vector corresponding to the to-be-detected live broadcast room;
a second determining module 703, configured to determine, based on the fused live broadcast feature vector, a security detection result corresponding to the to-be-detected live broadcast room;
and a control module 704, configured to perform live broadcast control based on the security detection result.
In a possible implementation manner, when the acquisition module 701 is configured to acquire the live broadcast feature vectors of the live broadcast room to be detected under a plurality of preset security feature dimensions in each of a plurality of consecutive time periods, the acquisition includes:
acquiring live broadcast content of a to-be-detected live broadcast room in each of a plurality of continuous time periods, wherein the live broadcast content comprises live broadcast pictures and/or live broadcast audio, and extracting live broadcast content characteristics corresponding to the live broadcast content under a plurality of preset live broadcast content safety characteristic dimensions;
and splicing, for each time period, the live broadcast content characteristics corresponding to that time period of the live broadcast room to be detected with the historical behavior characteristics and the user attribute characteristics corresponding to that time period, to obtain the live broadcast feature vector of the live broadcast room to be detected in that time period; a minimal splicing sketch is given below.
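As an illustration of the splicing step (the 100/20/12 feature dimensions are hypothetical, chosen only so that the spliced vector matches the 132-dimensional example used earlier), the per-period content, historical behavior and user attribute features are concatenated along the feature dimension:

```python
import torch

# Hypothetical per-period features for one live broadcast room (dimensions assumed):
content_feats = torch.randn(10, 100)    # extracted from live pictures and/or audio
behavior_feats = torch.randn(10, 20)    # historical behavior features per period
attribute_feats = torch.randn(10, 12)   # user attribute features per period

# Splicing along the feature dimension yields one live broadcast feature vector
# per time period (here 10 periods x 132 dimensions, matching the fig. 5 example).
live_feature_vectors = torch.cat([content_feats, behavior_feats, attribute_feats], dim=1)
print(live_feature_vectors.shape)  # torch.Size([10, 132])
```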
In a possible implementation manner, when the first determining module 702 is configured to determine, based on the live broadcast feature vector corresponding to each of a plurality of consecutive time periods of the live broadcast room to be detected, the fused live broadcast feature vector corresponding to the live broadcast room to be detected, the determination includes:
for each time period in the plurality of consecutive time periods, determining a plurality of consecutive target time periods corresponding to that time period, wherein the plurality of consecutive target time periods includes that time period;
fusing live broadcast feature vectors respectively corresponding to a plurality of continuous target time periods corresponding to the time periods to obtain first fused live broadcast feature vectors corresponding to the time periods;
and determining the corresponding fusion live broadcast feature vector of the to-be-detected live broadcast room based on the corresponding first fusion live broadcast feature vector of the to-be-detected live broadcast room in each time period.
In a possible implementation manner, when the first determining module 702 is configured to determine the fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the first fused live broadcast feature vector corresponding to each time period of the live broadcast room to be detected, the determination includes:
starting from a non-first time period in a plurality of continuous time periods, fusing a first fused live broadcast feature vector corresponding to a current time period and a memory live broadcast feature vector corresponding to a previous time period of the current time period to obtain a second fused live broadcast feature vector corresponding to a current time period of a to-be-detected live broadcast room;
determining a memory live broadcast feature vector corresponding to the current time period based on a second fusion live broadcast feature vector corresponding to the current time period, and fusing the memory live broadcast feature vector corresponding to the current time period and a first fusion live broadcast feature vector corresponding to the next time period of the current time period to obtain a second fusion live broadcast feature vector corresponding to the next time period of a to-be-detected live broadcast room;
and judging whether the next time period is the last time period in a plurality of continuous time periods, if so, taking the second fusion live broadcast feature vector corresponding to the next time period as the fusion live broadcast feature vector corresponding to the live broadcast room to be detected, and if not, taking the next time period as the current time period, and executing the step of determining the second fusion live broadcast feature vector corresponding to the current time period in the live broadcast room to be detected.
In a possible implementation manner, when the control module 704 is configured to perform live broadcast control based on the security detection result, the control includes:
if the security detection result indicates that the live broadcast content of the live broadcast room to be detected does not meet a preset security detection condition, outputting the identification corresponding to the live broadcast room to be detected and the live broadcast content corresponding to the live broadcast room to be detected.
In a possible implementation manner, the live broadcast monitoring apparatus further includes a network training module 705, where the network training module 705 is configured to train the neural network that determines the security detection result;
the neural network is trained using the live broadcast feature vectors of each of a plurality of sample live broadcast rooms, in a plurality of consecutive time periods and under a plurality of preset security feature dimensions, together with the pre-labeled security detection result corresponding to each sample live broadcast room.
In one possible implementation, the network training module 705 is configured to train the neural network in the following manner:
acquiring a live broadcast feature vector of each sample live broadcast room in each time period of a plurality of continuous time periods and under a plurality of preset security feature dimensions;
determining a fused live broadcast feature vector corresponding to each sample live broadcast room based on the live broadcast feature vector corresponding to that sample live broadcast room in each of a plurality of consecutive time periods;
predicting a safety detection result corresponding to the sample live broadcast room based on the fusion live broadcast feature vector corresponding to the sample live broadcast room;
and adjusting network parameter values in the neural network based on the predicted safety detection result corresponding to each sample live broadcast room and the actual safety detection result corresponding to the sample live broadcast room.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Corresponding to the live broadcast monitoring method in fig. 1, an embodiment of the present disclosure further provides an electronic device 800, and as shown in fig. 8, a schematic structural diagram of the electronic device 800 provided in the embodiment of the present disclosure includes:
a processor 81, a memory 82, and a bus 83; the memory 82 is used for storing execution instructions and includes a memory 821 and an external memory 822; the memory 821 herein is also referred to as an internal memory, and is used for temporarily storing operation data in the processor 81 and data exchanged with the external memory 822 such as a hard disk, and the processor 81 exchanges data with the external memory 822 through the memory 821, and when the electronic device 800 operates, the processor 81 communicates with the memory 82 through the bus 83, so that the processor 81 executes the following instructions: acquiring live broadcast feature vectors of a to-be-detected live broadcast room in various preset safety feature dimensions in each time period of a plurality of continuous time periods; determining a fusion live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vector corresponding to each time period in a plurality of continuous time periods of the live broadcast room to be detected; determining a safety detection result corresponding to a to-be-detected live broadcast room based on the fusion live broadcast feature vector; and carrying out live broadcast control based on the safety detection result.
The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the live broadcast monitoring method in the foregoing method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The computer program product of the live broadcast monitoring method provided in the embodiment of the present disclosure includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the live broadcast monitoring method described in the above method embodiment, which may be referred to specifically in the above method embodiment, and are not described herein again.
The embodiments of the present disclosure also provide a computer program, which when executed by a processor implements any one of the methods of the foregoing embodiments. The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. A live broadcast monitoring method is characterized by comprising the following steps:
acquiring live broadcast feature vectors of a to-be-detected live broadcast room in various preset safety feature dimensions in each time period of a plurality of continuous time periods;
determining a fusion live broadcast feature vector corresponding to the to-be-detected live broadcast room based on the live broadcast feature vector corresponding to each time period in the continuous multiple time periods of the to-be-detected live broadcast room;
determining a safety detection result corresponding to the to-be-detected live broadcast room based on the fusion live broadcast feature vector;
and carrying out live broadcast control based on the safety detection result.
2. The live broadcast monitoring method according to claim 1, wherein the acquiring of the live broadcast feature vectors of the live broadcast room to be detected in each of the continuous multiple time periods under multiple preset security feature dimensions comprises:
acquiring live broadcast content of the live broadcast room to be detected in each of the continuous multiple time periods, wherein the live broadcast content comprises live broadcast pictures and/or live broadcast audio, and extracting live broadcast content characteristics corresponding to the live broadcast content under the preset security characteristic dimensions of multiple live broadcast contents;
and splicing the live broadcast content characteristics corresponding to each time period of the live broadcast room to be detected, and the historical behavior characteristics and the user attribute characteristics corresponding to the time period of the live broadcast room to be detected to obtain the live broadcast characteristic vector of the live broadcast room to be detected in the time period.
3. The live broadcast monitoring method according to claim 1, wherein the determining a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on the live broadcast feature vector corresponding to each of the plurality of continuous time periods in the live broadcast room to be detected comprises:
for each time segment in the continuous multiple time segments, determining continuous multiple target time segments corresponding to the time segment, wherein the continuous multiple target time segments comprise the time segment;
fusing live broadcast feature vectors respectively corresponding to a plurality of continuous target time periods corresponding to the time periods to obtain first fused live broadcast feature vectors corresponding to the time periods;
and determining the corresponding fusion live broadcast feature vector of the to-be-detected live broadcast room based on the corresponding first fusion live broadcast feature vector of the to-be-detected live broadcast room in each time period.
4. The live broadcast monitoring method according to claim 3, wherein the determining a fused live broadcast feature vector corresponding to the live broadcast room to be detected based on a first fused live broadcast feature vector corresponding to each time period of the live broadcast room to be detected comprises:
from the non-first time period in the continuous multiple time periods, fusing a first fused live broadcast feature vector corresponding to the current time period and a memory live broadcast feature vector corresponding to the last time period of the current time period to obtain a second fused live broadcast feature vector corresponding to the current time period of the to-be-detected live broadcast room;
determining a memory live broadcast feature vector corresponding to the current time period based on a second fusion live broadcast feature vector corresponding to the current time period, and fusing the memory live broadcast feature vector corresponding to the current time period and a first fusion live broadcast feature vector corresponding to a next time period of the current time period to obtain a second fusion live broadcast feature vector corresponding to the next time period of the to-be-detected live broadcast room;
and judging whether the next time period is the last time period in the continuous multiple time periods, if so, taking a second fusion live broadcast feature vector corresponding to the next time period as a fusion live broadcast feature vector corresponding to the live broadcast room to be detected, and if not, taking the next time period as the current time period, and executing the step of determining the second fusion live broadcast feature vector corresponding to the current time period in the live broadcast room to be detected.
5. The live broadcast monitoring method according to claim 1, wherein performing live broadcast control based on the security detection result includes:
and if the safety detection result indicates that the live broadcast content of the to-be-detected live broadcast room does not accord with the preset safety detection condition, outputting the identification corresponding to the to-be-detected live broadcast room and the live broadcast content corresponding to the to-be-detected live broadcast room.
6. The live broadcast monitoring method according to any one of claims 1 to 5, wherein the safety detection result is determined by a pre-trained neural network;
the neural network utilizes live broadcast characteristic vectors of each sample live broadcast room in a plurality of sample live broadcast rooms in continuous time periods and under multiple preset safety characteristic dimensions, and pre-labeled safety detection results corresponding to each sample live broadcast room.
7. The live broadcast monitoring method according to claim 6, wherein the neural network is trained in the following manner:
acquiring a live broadcast feature vector of each sample live broadcast room in each time period of a plurality of continuous time periods and under a plurality of preset security feature dimensions;
determining a fusion live broadcast feature vector corresponding to each sample live broadcast room based on the live broadcast feature vector corresponding to each time slot in the continuous multiple time slots;
predicting a safety detection result corresponding to the sample live broadcast room based on the fusion live broadcast feature vector corresponding to the sample live broadcast room;
and adjusting the network parameter values in the neural network based on the predicted safety detection result corresponding to each sample live broadcast room and the actual safety detection result corresponding to the sample live broadcast room.
8. A live broadcast monitoring device, comprising:
the acquisition module is used for acquiring live broadcast feature vectors of a to-be-detected live broadcast room in various preset safety feature dimensions in each time period of a plurality of continuous time periods;
the first determining module is used for determining a fusion live broadcast feature vector corresponding to the to-be-detected live broadcast room based on the live broadcast feature vector corresponding to each time period in the continuous multiple time periods of the to-be-detected live broadcast room;
the second determining module is used for determining a safety detection result corresponding to the to-be-detected live broadcast room based on the fusion live broadcast feature vector;
and the control module is used for carrying out live broadcast control based on the safety detection result.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the live monitoring method of any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program for performing, when executed by a processor, the steps of a live monitoring method as claimed in any one of claims 1 to 7.
CN202010589239.2A 2020-06-24 2020-06-24 Live broadcast monitoring method and device, electronic equipment and storage medium Pending CN111770353A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010589239.2A CN111770353A (en) 2020-06-24 2020-06-24 Live broadcast monitoring method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010589239.2A CN111770353A (en) 2020-06-24 2020-06-24 Live broadcast monitoring method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111770353A true CN111770353A (en) 2020-10-13

Family

ID=72721650

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010589239.2A Pending CN111770353A (en) 2020-06-24 2020-06-24 Live broadcast monitoring method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111770353A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106454492A (en) * 2016-10-12 2017-02-22 武汉斗鱼网络科技有限公司 Live pornographic content audit system and method based on delayed transmission
CN109583266A (en) * 2017-09-28 2019-04-05 杭州海康威视数字技术股份有限公司 A kind of object detection method, device, computer equipment and storage medium
CN108038837A (en) * 2017-12-08 2018-05-15 苏州科达科技股份有限公司 Object detection method and system in video
US20190253744A1 (en) * 2018-02-13 2019-08-15 Ernest Huang Systems and methods for content management of live or streaming broadcasts and video publishing systems
CN109192222A (en) * 2018-07-23 2019-01-11 浙江大学 A kind of sound abnormality detecting system based on deep learning
CN110969066A (en) * 2018-09-30 2020-04-07 北京金山云网络技术有限公司 Live video identification method and device and electronic equipment
CN109344908A (en) * 2018-10-30 2019-02-15 北京字节跳动网络技术有限公司 Method and apparatus for generating model
CN109495766A (en) * 2018-11-27 2019-03-19 广州市百果园信息技术有限公司 A kind of method, apparatus, equipment and the storage medium of video audit

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112637621A (en) * 2020-12-09 2021-04-09 北京字节跳动网络技术有限公司 Live broadcast auditing method and device, electronic equipment and storage medium
CN112507884A (en) * 2020-12-10 2021-03-16 北京有竹居网络技术有限公司 Live content detection method and device, readable medium and electronic equipment
CN113766256A (en) * 2021-02-09 2021-12-07 北京沃东天骏信息技术有限公司 Live broadcast wind control method and device
CN113949887A (en) * 2021-09-24 2022-01-18 支付宝(杭州)信息技术有限公司 Method and device for processing network live broadcast data
CN115086721A (en) * 2022-08-22 2022-09-20 深圳市稻兴实业有限公司 Ultra-high-definition live system service supervision system based on data analysis
CN115086721B (en) * 2022-08-22 2022-10-25 深圳市稻兴实业有限公司 Ultra-high-definition live system service supervision system based on data analysis

Similar Documents

Publication Publication Date Title
CN111770353A (en) Live broadcast monitoring method and device, electronic equipment and storage medium
EP3333851B1 (en) Automated object and activity tracking in a live video feed
CN112579909A (en) Object recommendation method and device, computer equipment and medium
US20170169062A1 (en) Method and electronic device for recommending video
CN111770352B (en) Security detection method and device, electronic equipment and storage medium
CN113312512A (en) Training method, recommendation device, electronic equipment and storage medium
EP4273750A1 (en) Data processing method and apparatus, computing device, and test simplification device
CN112053205A (en) Product recommendation method and device through robot emotion recognition
CN114528474A (en) Method and device for determining recommended object, electronic equipment and storage medium
KR102226536B1 (en) Method, device and program for recommending charts to apply security data using AI model
CN117435999A (en) Risk assessment method, apparatus, device and medium
CN116955788A (en) Method, device, equipment, storage medium and program product for processing content
CN112788356B (en) Live broadcast auditing method, device, server and storage medium
CN114501163B (en) Video processing method, device and storage medium
CN114741606A (en) Enterprise recommendation method and device, computer readable medium and electronic equipment
CN110807179B (en) User identification method, device, server and storage medium
CN114265757A (en) Equipment anomaly detection method and device, storage medium and equipment
CN113076471A (en) Information processing method and device and computing equipment
CN114511095A (en) Data processing method and device, computing equipment and storage medium
CN111552850A (en) Type determination method and device, electronic equipment and computer readable storage medium
CN109522203B (en) Software product evaluation method and device
US20180157740A1 (en) Suggested offerings
CN113269262B (en) Method, apparatus and storage medium for training matching degree detection model
KR20170006312A (en) Apparatus and method for inferencing health condition
CN115499704B (en) Video recommendation method and device, readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013