WO2022262719A1

WO2022262719A1 - Live streaming processing method and apparatus, storage medium, and electronic device

Info

Publication number: WO2022262719A1
Application number: PCT/CN2022/098645
Authority: WO
Inventors: 刘伟科; 郐洪楠; 韩卫召; 沈俊杰; 邵京平
Original assignee: 北京沃东天骏信息技术有限公司
Priority date: 2021-06-15
Filing date: 2022-06-14
Publication date: 2022-12-22
Also published as: CN113329260A; CN113329260B

Abstract

Disclosed in embodiments of the present application are a live streaming processing method and apparatus, a storage medium, and an electronic device. The method comprises: obtaining a live display object and a live data stream of a live streaming terminal; performing action recognition and object recognition on the live data stream to determine recognition results of an action and object in the live data stream; performing taboo match on the recognition results of the action and object and the live display object in a taboo database, wherein the taboo database comprises taboo behavior information of a plurality of live display objects; and in response to the action and object recognition results being successfully matched in the taboo database, sending a taboo prompt to the live streaming terminal.

Description

A live broadcast processing method, device, storage medium and electronic equipment

This application claims priority to a Chinese patent application with application number 202110662255.4 filed with the China Patent Office on June 15, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The embodiments of the present application relate to the field of live broadcast technologies, for example, to a live broadcast processing method, device, storage medium, and electronic equipment.

Background technique

With the development of Internet technology and the improvement of social civilization, the live broadcast industry is becoming more and more mature. As the form with the strongest interactive experience, live broadcast has been loved by more and more users.

The related technologies have at least the following technical problem: countries around the world have many customs, habits and taboos. Because the entry threshold for live broadcasting is low, there is no guarantee that an anchor understands the customs and habits of every country, and because live broadcasting happens in real time, a taboo behavior can easily cause adverse effects.

Contents of the invention

Embodiments of the present application provide a live broadcast processing method, device, storage medium, and electronic equipment, so as to realize the identification of taboo behaviors in the live broadcast process and prompt the host end.

In the first aspect, the embodiment of the present application provides a live broadcast processing method, including:

Obtain the live display object and live data stream of the live broadcast terminal;

Perform action recognition and object recognition on the live data stream, and determine the action-object recognition result in the live data stream;

Perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of a plurality of live display objects;

In response to the action-object recognition result being successfully matched in the taboo library, send a taboo prompt to the live broadcast terminal.

In the second aspect, the embodiment of the present application also provides a live broadcast processing device, including:

The live data stream acquisition module is configured to acquire the live display objects and live data streams at the live end;

The video frame recognition module is configured to perform action recognition and object recognition on the live data stream, and determine the action-object recognition result in the live data stream;

The taboo matching module is configured to perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of multiple live display objects;

The taboo prompting module is configured to send a taboo prompt to the live broadcast terminal if the action-object recognition result is successfully matched in the taboo library.

In the third aspect, the embodiment of the present application also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and operable on the processor. When the processor executes the program, it implements the live broadcast processing method provided by any embodiment of the present application.

In a fourth aspect, the embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the live broadcast processing method provided in any embodiment of the present application is implemented.

Description of drawings

FIG. 1 is a schematic flowchart of a live broadcast processing method provided in Embodiment 1 of the present application;

FIG. 2 is a schematic diagram of a live broadcast scene provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of a process of generating a live data stream at a live end provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of a live broadcast processing method provided in Embodiment 2 of the present application;

FIG. 5 is a schematic structural diagram of a live broadcast processing device provided in Embodiment 3 of the present application;

FIG. 6 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present application.

Detailed description

The application will be described in detail below in conjunction with the accompanying drawings and embodiments.

Embodiment one

Figure 1 is a schematic flow chart of a live broadcast processing method provided in Embodiment 1 of the present application. This embodiment is applicable to real-time detection of taboo behavior during a live broadcast. The method can be executed by the live broadcast processing device provided by the embodiments of the present application. The live broadcast processing device can be implemented in software and/or hardware and can be configured on an electronic computing device such as a server or a computer. The method includes the following steps:

S110. Obtain a live display object and a live data stream of the live broadcast terminal.

S120. Perform action recognition and object recognition on the live data stream, and determine an action-object recognition result in the live data stream.

S130. Perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of multiple live display objects.

S140. If the action-object recognition result is successfully matched in the taboo library, send a taboo prompt to the live broadcast terminal.

For example, refer to FIG. 2, which is a schematic diagram of a live broadcast scene provided by an embodiment of the present application. The live broadcast terminal and the client are each connected to the live broadcast platform. In some embodiments, the live broadcast terminal and the client are two different electronic devices; in other embodiments, the live broadcast terminal and the client can be the same electronic device, with the different user permissions distinguished through different display interfaces, by logging in to different application programs (Application, APP), or by verifying different role identities. The live broadcast terminal and the client can be terminal devices such as mobile phones and tablet computers, and they have different permissions. The live broadcast terminal is set to provide live data, and has live background management authority and live broadcast object setting authority. The client has only the right to watch the live broadcast and the right to make transactions for the objects shown in the live broadcast.

The live broadcast terminal needs to be registered. The live broadcast platform is equipped with a registration module, which receives the user's registration request and user information and registers the user's live broadcast identity; after successful registration, the user has the corresponding permissions. When any device initiates a request to the live broadcast platform, the platform judges whether the device is logged in. After a successful login, the authority of the device is determined, that is, the device is determined to be either a live broadcast terminal or a client. A live broadcast terminal can set up a live broadcast, set the live broadcast object (such as the item or game to be introduced), generate the live data stream, and transmit the generated live data stream to the live broadcast platform in real time, so that the client can obtain the live data stream from the live broadcast platform.

Referring to FIG. 3, FIG. 3 is a schematic diagram of a process of generating a live data stream by a live broadcast terminal provided by an embodiment of the present application. After completing registration and login, the live broadcast terminal sets the live display object, where the live display object is the country that the live broadcast is oriented to, such as country A, country B, etc., and any live broadcast terminal can set at least one live display object. The live broadcast terminal includes local live broadcast collection equipment, which includes but is not limited to cameras, microphones, mobile phones and other terminal equipment. The live broadcast application on the terminal device calls the local live broadcast collection device to collect audio and video data to form a live data stream, and sends the live data stream to the live broadcast platform (such as a live server). The live broadcast platform checks the authority of the live broadcast terminal. If the live broadcast terminal has live broadcast authority, the platform receives and stores the live data stream and wraps it in an outer anchor service layer that serves external requests, so that when a request for the live data stream is received from a client, the live data stream is sent to the client for display.

In this embodiment, the live broadcast platform receives the live display objects sent by the live broadcast terminal. Different live display objects have different taboo behaviors. For example, when the live display objects include country C, the behavior of touching Buddha statues must not occur in the live broadcast; when the live display objects include country D, the behavior of sending an even number of flowers must not appear in the live broadcast. Presetting the live display objects makes it possible to detect taboo behaviors in a targeted manner, so that behaviors that are taboo for the live display objects do not appear in the live video stream.

The taboo library is preset in the live broadcast platform and includes the taboo behavior information of various countries. In some embodiments, the taboo library is stored in a data structure of {country-action(act: thing)}, where country is the country information, action is a taboo behavior, act is an action, and thing is an object. Exemplarily, the taboo library may include {country C-interact with Buddha statues (interaction-Buddha statues)}, {country B-display flowers (take/hold-lotus)} and so on. The taboo behavior information in the taboo library can be obtained by means such as internet search and encyclopedia query. The taboo behavior information can also be edited as needed, for example by adding a country together with its taboo behavior information, or by adding, modifying or deleting any item of taboo behavior information. With the taboo library preset, its taboo behavior information can be used to perform taboo detection on the live data stream of each live broadcast terminal and to provide taboo prompts for that stream, so that taboo behavior is avoided even when the anchor does not understand the conditions of each country.
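
As an illustration only, the {country-action(act: thing)} record and the add/delete editing described above might be sketched in Python as follows. The class name TabooEntry, its field names, the act labels, and the helper functions are hypothetical; the application does not prescribe a storage format beyond the four fields it names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TabooEntry:
    """One {country-action(act: thing)} record (hypothetical layout)."""
    country: str            # live display object, e.g. "country C"
    action: str             # name of the taboo behavior
    acts: Tuple[str, ...]   # action types that trigger the entry (illustrative labels)
    thing: str              # object involved in the behavior


def add_entry(library: List[TabooEntry], entry: TabooEntry) -> None:
    """Add a record unless an identical one is already stored."""
    if entry not in library:
        library.append(entry)


def delete_entries(library: List[TabooEntry], country: str, action: str) -> int:
    """Delete all records for a country/action pair; return how many were removed."""
    before = len(library)
    library[:] = [e for e in library
                  if not (e.country == country and e.action == action)]
    return before - len(library)


# The two example records mentioned in the text.
taboo_library: List[TabooEntry] = []
add_entry(taboo_library, TabooEntry("country C", "interact with Buddha statues",
                                    ("interact",), "Buddha statue"))
add_entry(taboo_library, TabooEntry("country B", "display flowers",
                                    ("take", "hold"), "lotus"))
```

Keeping act and thing as separate fields mirrors the later matching step, where the recognized action and the recognized object are compared independently.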

Taboo behavior information is formed by the combination of an action and an object. For any live data stream, action recognition and object recognition are performed, and the taboo behavior information is matched based on the recognized action result and object result. Because the taboo behavior information stored in the taboo library also includes the corresponding actions and objects, precise matching is possible, which improves the recognition accuracy for taboo behavior.

In some optional embodiments, action recognition and object recognition may be performed on every video frame in the live data stream, or only on some of the video frames in the live data stream; this is not limited here.

Optionally, performing action recognition and object recognition on the live data stream and determining the action-object recognition result in the live data stream includes: determining the video frames to be detected in the live data stream based on a preset time interval; performing action recognition on the video frame to obtain an action recognition result; performing object recognition on the video frame to obtain an object recognition result, wherein the object recognition result includes an object type and object attributes; and obtaining the current action-object recognition result of the live data stream based on the action recognition result and the object recognition result of the video frame. The preset time interval can be 3s-5s and can be set according to requirements. Each video frame in the live video stream carries a time stamp, and each video frame to be detected can be determined from the time stamps and the preset time interval. Selecting only some video frames by the preset time interval avoids the excessive computation that detecting every frame of the live data stream would cause, which would affect the quality of the live broadcast.
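
The selection of video frames by time stamp and preset interval can be sketched as below. This is a minimal Python illustration under stated assumptions: the function name select_frames is hypothetical, time stamps are taken to be seconds in ascending order, and the first frame is always selected.

```python
from typing import List, Sequence


def select_frames(timestamps: Sequence[float], interval_s: float = 3.0) -> List[int]:
    """Return indices of the frames to detect, at most one per interval.

    `timestamps` are per-frame time stamps in seconds, in ascending order.
    """
    selected: List[int] = []
    next_due = None  # time stamp at which the next frame becomes eligible
    for index, ts in enumerate(timestamps):
        if next_due is None or ts >= next_due:
            selected.append(index)
            next_due = ts + interval_s
    return selected


# One frame per second for ten seconds, sampled every 3 seconds.
frame_times = [float(t) for t in range(10)]
picked = select_frames(frame_times, interval_s=3.0)  # frames at 0 s, 3 s, 6 s, 9 s
```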

In some embodiments, action recognition can be performed on video frames based on a preset action recognition model: the extracted video frame is input into the pre-trained action recognition model, and the action type output by the model is obtained. For example, the action type may include but is not limited to walking, eating, smoking, holding, touching, etc. Optionally, the action recognition model may be a key point recognition model, which is set to recognize the key point positions of the persons in the video frame and determine the corresponding action type based on the combination of key point positions, wherein the key point positions may include the head, hands, legs, and multiple joint points of the hands and legs. Optionally, the action recognition model can be a skeleton recognition model, which is set to recognize the skeleton diagram of the person in the video frame and determine the corresponding action type based on the positions of multiple bones in the skeleton diagram. Optionally, the action recognition model includes but is not limited to a neural network model, a boosted tree model, a classifier model, etc.

The action recognition model can be trained based on sample images and the action labels of the sample images, wherein the sample images can be collected according to the required action classes. For example, images of a target subject performing a specific action can be captured with an image acquisition device such as a camera at different angles, different positions and different light intensities and used as sample images, and images retrieved from a search engine using the specific action as the search term can also be used as sample images. The specific actions include at least the actions in the taboo behavior information of each country in the taboo library.

In some optional embodiments, object recognition can be performed on video frames based on a preset object recognition model: the extracted video frame is input into the pre-trained object recognition model, and the object recognition result output by the model is obtained. The object recognition result may include an object type and object attributes, and the object attributes may include but are not limited to object quantity, object color and object size, wherein the object attributes to be recognized can be determined from the object attributes in the taboo behavior information of each country in the taboo library. For example, the taboo behavior information corresponding to country D includes the behavior of sending an even number of flowers, so the object attributes include the object quantity; the taboo behavior information corresponding to country A includes the behavior of giving green hats, so the object attributes include the object color.

The object recognition model can be trained based on sample images and object labels of the sample images, wherein the sample images can be obtained through a search engine, and the objects in the sample images can at least include objects in the taboo behavior information of each category in the taboo library. Optionally, the object recognition model includes but is not limited to a neural network model, a boosted tree model, a classifier model, etc., which is not limited thereto.

For any video frame, action recognition and object recognition can be performed synchronously, that is, the video frame is input to the action recognition model and the object recognition model at the same time, and the corresponding recognition results are obtained separately. Each recognition result carries the time stamp of the video frame and the live identifier of the live video stream to which the frame belongs. The action recognition result and the object recognition result with the same live identifier and time stamp are combined to obtain the current action-object recognition result, which avoids taboo-behavior mismatches caused by combining recognition results from different video frames or different live video streams.
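
The combination by live identifier and time stamp can be sketched as a simple keyed join. In the Python sketch below, the function name merge_results and the dictionary keys (live_id, timestamp, act, thing, attributes) are hypothetical and chosen only for illustration; the application specifies only that both results carry the time stamp and the live identifier.

```python
from typing import Dict, List


def merge_results(action_results: List[dict], object_results: List[dict]) -> List[dict]:
    """Pair action and object results that share live_id and timestamp."""
    objects_by_key: Dict[tuple, dict] = {
        (r["live_id"], r["timestamp"]): r for r in object_results
    }
    merged = []
    for a in action_results:
        o = objects_by_key.get((a["live_id"], a["timestamp"]))
        if o is None:
            continue  # no object result for this frame of this stream
        merged.append({
            "live_id": a["live_id"],
            "timestamp": a["timestamp"],
            "act": a["act"],
            "thing": o["thing"],
            "attributes": o.get("attributes", {}),
        })
    return merged


actions = [
    {"live_id": "room-1", "timestamp": 3, "act": "hold"},
    {"live_id": "room-2", "timestamp": 3, "act": "touch"},
]
objects = [
    {"live_id": "room-1", "timestamp": 3, "thing": "flowers",
     "attributes": {"quantity": 2}},
    {"live_id": "room-2", "timestamp": 6, "thing": "Buddha statue"},
]
combined = merge_results(actions, objects)
```

In this example only the room-1 pair is combined; the room-2 results come from different frames, so they are not merged into a spurious "touch Buddha statue" result.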

The obtained action-object recognition result is matched in the taboo library to determine whether it belongs to a taboo behavior. Because different countries have different taboo behavior information, the same action-object recognition result has different matching results for different countries. For example, the behavior of sending an even number of flowers is not a taboo behavior in country A, but it is a taboo behavior in country D. Therefore, taboo matching is performed on the action-object recognition result specifically for the live display objects sent by the live broadcast terminal, which improves the matching accuracy and avoids problems such as missed or false reminders caused by wrong matching.

Optionally, performing taboo matching on the action-object recognition result and the live display object in the taboo library includes: determining a matching range in the taboo library based on the live display object, wherein the matching range includes the taboo behavior information of the live display object; and matching the action-object recognition result within the matching range of the live display object. Exemplarily, the taboo behavior information of the live display objects selected by the live broadcast terminal is extracted from the taboo library to form the matching range, which includes all the taboo behavior information in the taboo library for each live display object selected by the live broadcast terminal. The action-object recognition results obtained by video frame recognition are matched within this matching range, which reduces the amount of data to be matched and improves the pertinence and accuracy of the matching.

The action, object, and object attributes in the action-object recognition result are matched against the taboo behavior information within the matching range. If the action, object, and object attributes are all matched successfully, it is determined that the video frame includes taboo behavior. If at least one of the action, object, and object attributes is not matched successfully, it is determined that the video frame does not include taboo behavior. Exemplarily, if the action-object recognition result is holding-flowers (three flowers), the object attribute is inconsistent with the behavior of sending an even number of flowers, so this is not a taboo behavior. If the action-object recognition result is holding-flowers (two flowers), the action, object, and object attributes are all consistent with the behavior of sending an even number of flowers, so this is a taboo behavior.
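
The two steps above, restricting the taboo library to a matching range and then requiring action, object and object attributes to all match, might be sketched as follows. This is a hedged Python illustration: the library layout, the attribute_ok predicate, the act labels, and the function names are assumptions made for the example, not structures defined by the application.

```python
from typing import Dict, List, Optional

# Hypothetical taboo library keyed by country. Each entry lists the acts,
# the object, and a predicate over the recognized object attributes.
TABOO_LIBRARY: Dict[str, List[dict]] = {
    "country D": [{
        "action": "send even-numbered flowers",
        "acts": {"hold", "send"},
        "thing": "flowers",
        # Taboo only when the recognized quantity is even (unknown counts as odd).
        "attribute_ok": lambda attrs: attrs.get("quantity", 1) % 2 == 0,
    }],
    "country C": [{
        "action": "interact with Buddha statues",
        "acts": {"touch"},
        "thing": "Buddha statue",
        "attribute_ok": lambda attrs: True,  # no attribute constraint
    }],
}


def matching_range(library: Dict[str, List[dict]],
                   display_objects: List[str]) -> List[dict]:
    """Collect the taboo entries of the selected live display objects only."""
    return [entry for country in display_objects
            for entry in library.get(country, [])]


def match_taboo(result: dict, entries: List[dict]) -> Optional[dict]:
    """Return the first entry whose act, thing and attributes all match."""
    for entry in entries:
        if (result["act"] in entry["acts"]
                and result["thing"] == entry["thing"]
                and entry["attribute_ok"](result.get("attributes", {}))):
            return entry
    return None


scope_d = matching_range(TABOO_LIBRARY, ["country D"])
three = {"act": "hold", "thing": "flowers", "attributes": {"quantity": 3}}
two = {"act": "hold", "thing": "flowers", "attributes": {"quantity": 2}}
```

With these inputs, match_taboo(three, scope_d) finds no entry, while match_taboo(two, scope_d) returns the country D entry; the same two-flower result matched against a country C range finds nothing.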

In this embodiment, by separately recognizing the object and the action, the recognition accuracy is improved, and at the same time, the object and the action are matched separately, which reduces the difficulty of behavior matching and improves the accuracy of behavior matching. When the action-object recognition result of any live video stream is successfully matched in the taboo library, a taboo reminder is sent to the live broadcast terminal corresponding to the live video stream, which is used to prompt the anchor to stop the taboo behavior and correct it.

Optionally, sending a taboo prompt to the live broadcast terminal includes: extracting the prompt content of the successfully matched taboo behavior information, and sending the prompt content to the live broadcast terminal, so that the live broadcast terminal displays the prompt content. In some optional embodiments, the taboo library may store prompt content for the taboo behavior information, and the prompt content may include a description of the taboo behavior and the correct behavior corresponding to the taboo behavior. Optionally, the taboo library may be stored in a data structure of {country-action(act: thing)-taboo-right}, where taboo is the description information of the taboo behavior, and right is the correct behavior corresponding to the taboo behavior. Exemplarily: {Country D-present flowers (send-flowers)-cannot be an even number (especially not 2)-must be an odd number}, {Country B-display flowers (take/hold-flowers)-cannot be lotus-other flowers are acceptable}, {Country C-interact with Buddha statues (interaction-Buddha statue)-cannot touch Buddha statues (especially the head)-need to maintain respect for Buddha statues}. When the action-object recognition result matches the act item and the thing item in the taboo behavior information, the taboo item and the right item are extracted to form the taboo prompt information, and the taboo prompt information is sent to the live broadcast terminal based on the live identifier of the live video stream, so that the live broadcast terminal displays the prompt content when it receives the taboo prompt information.
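
Forming the taboo prompt information from the taboo and right items can be sketched as below. This is a minimal Python illustration; the function name build_taboo_prompt, the dictionary keys, and the wording of the text field are assumptions, and the actual transport to the live broadcast terminal is not shown.

```python
def build_taboo_prompt(entry: dict, live_id: str) -> dict:
    """Extract the taboo and right items of a matched entry into a prompt.

    `live_id` identifies the live video stream, so the prompt can be routed
    to the live broadcast terminal that produced the stream.
    """
    return {
        "live_id": live_id,
        "country": entry["country"],
        "taboo": entry["taboo"],
        "right": entry["right"],
        "text": f'{entry["country"]}: {entry["taboo"]}; {entry["right"]}.',
    }


# Example entry following the {country-action(act: thing)-taboo-right} layout.
matched_entry = {
    "country": "Country D",
    "action": "present flowers",
    "act": "send",
    "thing": "flowers",
    "taboo": "cannot be an even number (especially not 2)",
    "right": "must be an odd number",
}
prompt = build_taboo_prompt(matched_entry, live_id="room-1")
```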

In some optional embodiments, the prompt content includes at least one of text, picture and video, and the correct behavior corresponding to the taboo behavior can likewise take at least one of these forms. Text prompt content can be displayed on the interface of the live broadcast terminal as text bullet screens or text pop-up windows. Picture prompt content can be displayed floating on the interface of the live broadcast terminal at a certain screen ratio for a preset duration. Video prompt content can be played repeatedly N times as a picture-in-picture video covering a certain proportion of the interface of the live broadcast terminal. For example, a transparent mask layer can be added over the live broadcast layer of the live broadcast terminal, and the specified video can be played on the transparent mask layer at the above ratio and position.

In the technical solution of this embodiment, a taboo library containing the taboo behavior information of each country is set up, which ensures the completeness and accuracy of the taboo behavior information. The live display objects selected by the live broadcast terminal are obtained and used to determine the taboo matching range in the taboo library, which ensures the pertinence and accuracy of taboo behavior matching. Action recognition and object recognition are performed on the acquired live data stream to determine the action-object recognition result, and targeted taboo matching is performed in the taboo library based on the action-object recognition result and the live display objects, so as to detect behavior in the live data stream that is taboo for the live display objects. Recognizing the action and the object separately reduces the difficulty of behavior recognition and matching and improves the recognition accuracy and matching accuracy. When the matching is successful, a taboo prompt is sent to the live broadcast terminal, which avoids taboo behavior in the live broadcast caused by the anchor not understanding the taboo behavior information of other countries, and improves the standardization of live broadcasts for different display objects.

Embodiment two

Fig. 4 is a schematic flow chart of a live broadcast processing method provided in Embodiment 2 of the present application, which is refined on the basis of the above embodiment. Optionally, after the action-object recognition result is successfully matched in the taboo library, the method further includes: determining whether the successfully matched taboo behavior information satisfies a judgment condition; if it does not satisfy the judgment condition, recording the successfully matched taboo behavior information and continuing to perform action recognition and object recognition on the next video frame; and if it satisfies the judgment condition, executing the step of sending a taboo prompt to the live broadcast terminal. Referring to Figure 4, the method includes:

S210. Obtain a live display object and a live data stream of the live broadcast terminal.

S220. Perform action recognition and object recognition on the live data stream, and determine an action-object recognition result in the live data stream.

S230. Perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of multiple live display objects.

S240. If the action-object recognition result is successfully matched in the taboo library, determine whether the successfully matched taboo behavior information satisfies the judgment condition. If it satisfies the judgment condition, execute step S260; if it does not satisfy the judgment condition, execute step S250.

S250. Record the successfully matched taboo behavior information, return to step S220, and perform action recognition and object recognition on the next video frame.

S260. Send a taboo prompt to the live broadcast terminal.

In order to reduce misjudgments and avoid repeated reminders, a judgment condition is set for each item of taboo behavior information. The judgment condition is used to determine the degree of the taboo behavior, and different taboo behavior information can correspond to different judgment conditions. If the judgment condition is met, sending a taboo prompt to the live broadcast terminal is triggered, to prompt the anchor to make corrections. If the judgment condition is not met, the successfully matched taboo behavior information is recorded and accumulated toward the condition, and taboo behavior recognition and matching continue on the next video frame until the judgment condition is met. With the judgment conditions set, no prompt is given for live video streams that do not meet them, which avoids frequent prompts interfering with the live broadcast.

Optionally, the judgment condition includes one or both of a duration condition and a frequency condition. For example, the judgment condition may be that the cumulative duration of the successfully matched taboo behavior information exceeds a preset duration N, and/or that the cumulative frequency of the successfully matched taboo behavior information exceeds a preset frequency M, where N is a natural number greater than 0 and M is a positive integer greater than or equal to 1. Setting different judgment conditions for different taboo behavior information allows each kind of taboo behavior to be judged in a targeted way. Strict judgment conditions, for example a count of 1, can be set for bad behaviors such as offensive finger gestures, which improves the civility of the live broadcast; loose judgment conditions can be set for non-bad behaviors such as chickens, which reduces the interference that frequent reminders cause to the live broadcast.

In this embodiment, a recognition list is formed from the action-object recognition result of each detected video frame, and the recognition list may include a duration recognition list and/or a frequency recognition list. The duration recognition list can have the data structure List[{action-startTimestamp-ts}], where action is a taboo behavior, startTimestamp is the trigger start time, ts is the total duration of the behavior, and List means that the data structure is an ordered array. When a live broadcast triggers taboo content for the first time, the List array is initialized and one record is saved to the array, with a ts value of 0. For example, if the action is presenting flowers and the startTimestamp is the Unix time 16021313211, the saved record is {present flowers-16021313211-0}. Video frames are detected at the preset time interval (for example, every 3 seconds). If the taboo match is successful, a new record is generated and compared with the last record in the List array; the comparison covers action and startTimestamp. If the action is unchanged and the startTimestamp interval is 3 seconds (an error of plus or minus 300ms is allowed), startTimestamp is overwritten and ts is increased by 3. For example, the record becomes {present flowers-16021313214-3}, where startTimestamp has increased by 3 and ts has also increased by 3. If, at a later detection, the action is unchanged but the startTimestamp interval is greater than 3 seconds, startTimestamp is overwritten but ts is not modified.

Overwriting startTimestamp updates the start time each time the taboo behavior is triggered, which makes it easy to identify whether the current trigger and the previous trigger of the taboo behavior are continuous. If the startTimestamp of the current trigger is the sum of the startTimestamp and ts of the previous trigger, the behavior is continuous and the durations can be accumulated. If the startTimestamp of the current trigger is not the sum of the startTimestamp and ts of the previous trigger, the behavior is discontinuous. If the action has changed at a later detection, a new record is created and appended to the end of the ordered list.
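
The update rule for the duration recognition list can be sketched in Python as follows. This sketch follows the overwrite-and-compare variant described in the text (same action, gap of one interval within a 300ms tolerance). The function name update_duration_list and the dictionary representation are hypothetical, and treating a gap shorter than the interval like a discontinuous trigger is an assumption, since the text does not cover that case.

```python
from typing import List


def update_duration_list(records: List[dict], action: str, now: float,
                         interval: float = 3.0, tolerance: float = 0.3) -> dict:
    """Update List[{action-startTimestamp-ts}] after a successful taboo match."""
    if not records or records[-1]["action"] != action:
        # First trigger, or the action changed: append a new record with ts = 0.
        record = {"action": action, "startTimestamp": now, "ts": 0}
        records.append(record)
        return record
    last = records[-1]
    gap = now - last["startTimestamp"]
    if abs(gap - interval) <= tolerance:
        last["ts"] += interval      # continuous trigger: accumulate the duration
    last["startTimestamp"] = now    # startTimestamp is overwritten in both cases
    return last


duration_list: List[dict] = []
update_duration_list(duration_list, "present flowers", 16021313211)  # ts = 0
update_duration_list(duration_list, "present flowers", 16021313214)  # ts = 3
update_duration_list(duration_list, "present flowers", 16021313220)  # gap 6 s, ts stays 3
update_duration_list(duration_list, "touch Buddha statue", 16021313223)  # new record
```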

The frequency recognition list may use the data structure List[{action-count}], where count is the cumulative number of triggers. When taboo content is triggered for the first time, the List structure is initialized and one action-count record is formed, such as {touch Buddha statue-1}. Each time taboo content is triggered, it is determined whether a record with the same action already exists in the List. If such a record exists, its count is increased by 1; if no such record exists, a new action-count record is added.
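A minimal Python sketch of the frequency recognition list update described above follows. The function name update_frequency_list and the dictionary keys are assumptions made for illustration; the text only specifies the {action-count} shape.

```python
from typing import Dict, List, Union

FrequencyRecord = Dict[str, Union[str, int]]


def update_frequency_list(records: List[FrequencyRecord],
                          action: str) -> FrequencyRecord:
    """Record one more trigger of `action` in the {action-count} list."""
    for record in records:
        if record["action"] == action:
            # The same action already exists in the list: count + 1.
            record["count"] += 1
            return record
    # No record with this action yet: add a new action-count entry.
    record = {"action": action, "count": 1}
    records.append(record)
    return record


frequency_list: List[FrequencyRecord] = []
update_frequency_list(frequency_list, "touch Buddha statue")
update_frequency_list(frequency_list, "touch Buddha statue")
```

A dictionary keyed by action would make the lookup constant-time; the list form is kept here only to stay close to the List[{action-count}] structure in the text.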

By setting the duration recognition list and the frequency recognition list, the taboo behaviors already triggered in the live data stream are recorded, which facilitates evaluation against the judgment conditions. In some optional embodiments, determining whether the successfully matched taboo behavior information satisfies the judgment condition includes: updating the duration information and/or frequency information of the taboo behavior information based on the successfully matched taboo behavior information; and determining whether the updated duration information and/or frequency information satisfies the judgment condition corresponding to the taboo behavior information. Exemplarily, the duration recognition list and/or the frequency recognition list may be updated based on the successfully matched taboo behavior information, and the current duration information and/or frequency information may be determined. The judgment condition corresponding to the successfully matched taboo behavior information is extracted from the taboo library, and the current duration information and/or frequency information is compared with the extracted judgment condition. Exemplarily, the judgment condition is that the accumulated duration is greater than 15 s: if the current duration is less than 15 s, the judgment condition is not satisfied, and if the current duration is greater than 15 s, the judgment condition is satisfied.
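The comparison against a judgment condition can be sketched as below. The JudgmentCondition type and its field names are hypothetical, and two points are assumptions because the text leaves them open: a duration exactly equal to the threshold is treated as not satisfying the condition, and when both a duration and a frequency threshold are configured, exceeding either one is treated as sufficient.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JudgmentCondition:
    """Thresholds stored with a taboo entry (hypothetical field names)."""
    duration_over_s: Optional[float] = None  # e.g. 15 for "greater than 15 s"
    count_over: Optional[int] = None         # frequency threshold, if any


def satisfies(condition: JudgmentCondition,
              duration_s: float, count: int) -> bool:
    """Return True when the accumulated duration or count exceeds its threshold."""
    duration_hit = (condition.duration_over_s is not None
                    and duration_s > condition.duration_over_s)
    count_hit = (condition.count_over is not None
                 and count > condition.count_over)
    # Assumption: either configured threshold being exceeded is enough.
    return duration_hit or count_hit


condition = JudgmentCondition(duration_over_s=15)
should_prompt = satisfies(condition, duration_s=18, count=0)
```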

In the technical solution provided in this embodiment, after the action-object recognition result is successfully matched in the taboo library, it is determined whether the successfully matched taboo behavior information satisfies the judgment condition, and the prompt to the live broadcast terminal is triggered only when the judgment condition is satisfied. Live data streams that do not satisfy the judgment condition are not prompted, which prevents frequent reminders from interfering with the live broadcast.

Embodiment three

FIG. 4 is a schematic structural diagram of a live broadcast processing device provided in Embodiment 3 of the present application. The live broadcast processing device may be configured in a host platform or a live broadcast server. The device includes:

The live data stream obtaining module 310 is configured to obtain the live display object and the live data stream of the live broadcast terminal;

The video frame recognition module 320 is configured to perform action recognition and object recognition on the live data stream, and determine the action-object recognition result in the live data stream;

The taboo matching module 330 is configured to perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of multiple live display objects;

The taboo prompting module 340 is configured to send a taboo prompt to the live broadcast terminal if the action-object recognition result is successfully matched in the taboo library.

On the basis of the foregoing embodiments, the video frame recognition module 320 is configured to:

Determine video frames for detection in the live data stream based on a preset time interval;

Perform action recognition on the video frame to obtain an action recognition result;

Perform object recognition on the video frame to obtain an object recognition result, wherein the object recognition result includes object type and object attribute;

Obtain the current action-object recognition result of the live data stream based on the action recognition result and the object recognition result of the video frame.
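The first step, selecting video frames for detection at a preset time interval, can be sketched as follows. This is only an illustration: the function name sample_frames is hypothetical, the stream is assumed to arrive as (timestamp in seconds, frame) pairs, and the recognition models themselves are outside the scope of the sketch.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple


def sample_frames(frames: Iterable[Tuple[float, Any]],
                  interval_s: float = 3.0) -> Iterator[Tuple[float, Any]]:
    """Yield the first frame, then one frame per `interval_s` seconds."""
    next_due: Optional[float] = None
    for timestamp, frame in frames:
        if next_due is None or timestamp >= next_due:
            yield timestamp, frame
            # The next detection is due one interval after this frame.
            next_due = timestamp + interval_s


# Simulated stream: one placeholder frame every 0.5 s for 7 seconds.
stream = [(i * 0.5, f"frame-{i}") for i in range(14)]
sampled = list(sample_frames(stream, interval_s=3.0))
```

Each sampled frame would then be passed to action recognition and object recognition, and the two results combined into the action-object recognition result.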

On the basis of the foregoing embodiments, the taboo matching module 330 is configured to:

Determining a matching range in the taboo library based on the live broadcast display object, wherein the matching range includes taboo behavior information of the live broadcast display object;

Matching the recognition result of the action-object within the matching range of the live display object.
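The two matching steps above can be sketched in Python as follows. The layout of the taboo library (a dictionary keyed by live display object), the entry fields, and the sample entry are all hypothetical and chosen only to illustrate narrowing the search to a matching range before comparing the action-object recognition result. Object attributes are ignored in this sketch.

```python
from typing import Dict, List, Optional

TabooEntry = Dict[str, str]

# Hypothetical taboo library: live display object -> taboo behavior entries.
TABOO_LIBRARY: Dict[str, List[TabooEntry]] = {
    "Buddha statue": [
        {"action": "touch", "object_type": "Buddha statue",
         "prompt": "Please avoid touching the statue."},
    ],
}


def match_taboo(library: Dict[str, List[TabooEntry]], display_object: str,
                action: str, object_type: str) -> Optional[TabooEntry]:
    """Match an action-object result within the display object's range."""
    # Step 1: the matching range is the entries of this display object only.
    matching_range = library.get(display_object, [])
    # Step 2: compare the recognition result with each entry in the range.
    for entry in matching_range:
        if entry["action"] == action and entry["object_type"] == object_type:
            return entry
    return None


hit = match_taboo(TABOO_LIBRARY, "Buddha statue", "touch", "Buddha statue")
miss = match_taboo(TABOO_LIBRARY, "tea set", "touch", "Buddha statue")
```

Restricting the search to one display object keeps taboo entries for unrelated display objects from producing matches.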

On the basis of the foregoing embodiments, the device also includes:

The taboo judgment module is configured to: after the action-object recognition result is successfully matched in the taboo library, determine whether the successfully matched taboo behavior information satisfies the judgment condition; in response to the successfully matched taboo behavior information not satisfying the judgment condition, record the successfully matched taboo behavior information and continue to perform action recognition and object recognition on the next video frame; and in response to the successfully matched taboo behavior information satisfying the judgment condition, execute the step of sending a taboo prompt to the live broadcast terminal.

On the basis of the above embodiments, the judgment conditions include one or two of duration conditions and frequency conditions;

The taboo judgment module is configured to determine whether the successfully matched taboo behavior information satisfies the judgment condition by:

Updating the duration information and/or frequency information of the taboo behavior information based on the successfully matched taboo behavior information;

Determining whether the updated duration information and/or frequency information satisfies the judgment condition corresponding to the taboo behavior information.

On the basis of the foregoing embodiments, the taboo prompting module 340 is configured to:

Extracting the prompt content of the successfully matched taboo behavior information, and sending the prompt content to the live broadcast terminal, so that the live broadcast terminal displays the prompt content.

Based on the above embodiments, the prompt content includes at least one of text, picture and video.
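One possible way to package the prompt content for the live broadcast terminal is sketched below. The message layout, the field names, and the function build_taboo_prompt are assumptions made for illustration; the text only requires that the prompt content include at least one of text, picture, and video.

```python
import json
from typing import Dict

ALLOWED_KINDS = ("text", "picture", "video")


def build_taboo_prompt(taboo_action: str, content: Dict[str, str]) -> str:
    """Serialize prompt content that has at least one of text, picture, video."""
    # Keep only the recognized kinds that actually carry a value.
    kept = {kind: content[kind] for kind in ALLOWED_KINDS if content.get(kind)}
    if not kept:
        raise ValueError("prompt content needs text, picture, or video")
    message = {"type": "taboo_prompt", "action": taboo_action, "content": kept}
    return json.dumps(message, ensure_ascii=False)


payload = build_taboo_prompt(
    "touch Buddha statue",
    {"text": "Please avoid touching the statue."},
)
```

The live broadcast terminal would decode the message and display whichever content kinds are present.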

The live broadcast processing device provided in the embodiment of the present application can execute the live broadcast processing method provided in any embodiment of the present application, and has corresponding functional modules for executing the method.

Embodiment Four

FIG. 6 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present application. FIG. 6 shows a block diagram of an electronic device 12 suitable for implementing embodiments of the present application. The electronic device 12 shown in FIG. 6 is only an example, and should not limit the functions and scope of use of the embodiment of the present application. Device 12 is typically an electronic device undertaking image classification functions.

As shown in FIG. 6, electronic device 12 takes the form of a general-purpose computing device. Components of the electronic device 12 may include, but are not limited to: at least one processor 16, a storage device 28, and a bus 18 connecting various system components (including the storage device 28 and the processor 16).

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures. For example, these architectures include but are not limited to Industry Standard Architecture (Industry Standard Architecture, ISA) bus, Micro Channel Architecture (Micro Channel Architecture, MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (Video Electronics Standards Association, VESA) local bus and peripheral component interconnect (Peripheral Component Interconnect, PCI) bus.

Electronic device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by electronic device 12 and include both volatile and nonvolatile media, removable and non-removable media.

Storage device 28 may include computer system readable media in the form of volatile memory, such as random access memory (Random Access Memory, RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable nonvolatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable nonvolatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to bus 18 via at least one data medium interface. The storage device 28 may include at least one program product having a set (for example, at least one) of program modules configured to perform the functions of the various embodiments of the present application.

A program 36 having a set (at least one) of program modules 26 may be stored, for example, in storage device 28. Such program modules 26 include, but are not limited to, an operating system, at least one application program, other program modules, and program data; each or some combination of these may include an implementation of a network environment. Program modules 26 generally perform the functions and/or methods of the embodiments described herein.

The electronic device 12 can also communicate with at least one external device 14 (such as a keyboard, a pointing device, a camera, a display 24, etc.), with at least one device that enables a user to interact with the electronic device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the electronic device 12 to communicate with at least one other computing device. Such communication may occur through input/output (I/O) interface 22. Moreover, the electronic device 12 can also communicate with at least one network (such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through the network adapter 20. As shown, network adapter 20 communicates with other modules of electronic device 12 via bus 18. It should be appreciated that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.

The processor 16 executes various functional applications and data processing by running the programs stored in the storage device 28, for example, implementing the live broadcast processing method provided by the above-mentioned embodiments of the present application.

Embodiment five

Embodiment 5 of the present application provides a computer-readable storage medium, on which a computer program is stored. When the program is executed by a processor, the live broadcast processing method provided in the embodiment of the present application is implemented.

Of course, the computer program stored on the computer-readable storage medium provided by the embodiment of the present application is not limited to the method operations described above, and can also execute the live broadcast processing method provided by any embodiment of the present application.

The computer storage medium in the embodiments of the present application may use any combination of at least one computer-readable medium. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: electrical connections having at least one lead, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a data signal that carries computer readable program code and is propagated in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the foregoing.

Computer program code for carrying out the operations of this application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet Service Provider).

Claims

A live broadcast processing method, comprising:

Obtain the live display object and live data stream of the live broadcast terminal;

Perform action recognition and object recognition on the live data stream, and determine the action-object recognition result in the live data stream;

Performing taboo matching in the taboo library on the action-object recognition result and the live display object, wherein the taboo library includes taboo behavior information of a plurality of live display objects;

In response to the recognition result of the action-object being successfully matched in the taboo database, a taboo prompt is sent to the live broadcast terminal.
The method according to claim 1, wherein the performing action recognition and object recognition on the live data stream, and determining the action-object recognition result in the live data stream include:

Determining video frames for detection in the live data stream based on a preset time interval;

Perform motion recognition on the video frame to obtain a motion recognition result;

Performing object recognition on the video frame to obtain an object recognition result, wherein the object recognition result includes object type and object attribute;

The current action-object recognition result of the live data stream is obtained based on the action recognition result and the object recognition result of the video frame.
The method according to claim 1, wherein the performing taboo matching in the taboo library on the action-object recognition result and the live display object includes:

Determining a matching range in the taboo library based on the live broadcast display object, wherein the matching range includes taboo behavior information of the live broadcast display object;

Matching the recognition result of the action-object within the matching range of the live display object.
The method according to claim 1, wherein after the action-object recognition result is successfully matched in the taboo library, the method further comprises:

Determine whether the successfully matched taboo behavior information satisfies the judgment condition, and record the successfully matched taboo behavior information in response to the successfully matched taboo behavior information not meeting the judgment condition, and continue to perform action recognition and object recognition on the next video frame;

In response to the successfully matched taboo behavior information meeting the judgment condition, a taboo reminder is sent to the live broadcast terminal.
The method according to claim 4, wherein the judgment condition includes one or both of a duration condition and a frequency condition;

The determining whether the matching taboo behavior information satisfies the judging conditions includes:

Updating at least one of duration information and frequency information of the taboo behavior information based on the successfully matched taboo behavior information;

It is determined whether at least one of the updated duration information and frequency information satisfies the judging condition corresponding to the taboo behavior information.
The method according to claim 1, wherein the sending a taboo prompt to the live broadcast terminal comprises:

Extracting the prompt content of the successfully matched taboo behavior information, and sending the prompt content to the live broadcast terminal, so that the live broadcast terminal displays the prompt content.
The method according to claim 6, wherein the prompt content includes at least one of text, picture and video.
A live broadcast processing device, comprising:

The live data stream acquisition module is configured to acquire the live display objects and live data streams at the live end;

The video frame recognition module is configured to perform action recognition and object recognition on the live data stream, and determine the action-object recognition result in the live data stream;

The taboo matching module is configured to perform taboo matching on the action-object recognition result and the live display object in a taboo library, wherein the taboo library includes taboo behavior information of multiple live display objects;

The taboo prompting module is configured to send a taboo prompt to the live broadcast terminal if the action-object recognition result is successfully matched in the taboo library.
An electronic device, comprising a memory, a processor, and a computer program stored on the memory and operable on the processor, wherein when the processor executes the computer program, the live broadcast processing method according to any one of claims 1-7 is implemented.
A computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the live broadcast processing method according to any one of claims 1-7 is implemented.