CN114501060A - Live broadcast background switching method and device, storage medium and electronic equipment


Info

Publication number
CN114501060A
Authority
CN
China
Prior art keywords
target
background
live
action
feature
Prior art date
Legal status
Pending
Application number
CN202210080646.XA
Other languages
Chinese (zh)
Inventor
岑德炼
蔡海军
Current Assignee
Guangzhou Fanxing Huyu IT Co Ltd
Original Assignee
Guangzhou Fanxing Huyu IT Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Fanxing Huyu IT Co Ltd
Priority to CN202210080646.XA
Publication of CN114501060A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21 Server components or server architectures
    • H04N 21/218 Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187 Live feed
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N 21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 Monomedia components thereof
    • H04N 21/8146 Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics

Abstract

The invention discloses a live broadcast background switching method and device, a storage medium and electronic equipment. The method comprises the following steps: when it is detected that a target object in a live data stream is currently in a target behavior state, determining an action feature of a target behavior action of the target object, an audio feature of the live data stream and an object feature of the target object, wherein the target behavior action is the behavior action of the target object in the target behavior state; calculating a target background feature using the action feature, the audio feature and the object feature, wherein the target background feature indicates a live background matching the target behavior state of the target object; and when a target live background matching the target background feature is determined from a background database, switching the live background to the target live background. The invention solves the technical problem of a poor live broadcast effect caused by a live background that cannot change dynamically.

Description

Live broadcast background switching method and device, storage medium and electronic equipment
Technical Field
The invention relates to the field of image processing, in particular to a live broadcast background switching method and device, a storage medium and electronic equipment.
Background
At present, the background image of a live broadcast room is mainly set by the anchor; alternatively, the server may replace the anchor's real background with a virtual background via portrait matting when pushing the stream.
However, both the background image set by the anchor and the virtual background substituted by the server are static two-dimensional images that cannot change automatically as the live session evolves, for example as the anchor's actions or style change. As a result, when the anchor changes during live broadcasting, the static background no longer matches the anchor, and the live effect suffers.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the invention provide a live background switching method and device, a storage medium and electronic equipment, so as to at least solve the technical problem of a poor live effect caused by a live background that cannot change dynamically.
According to an aspect of the embodiments of the present invention, a method for switching a live broadcast background is provided, including: under the condition that a target object in a live data stream is detected to be currently in a target behavior state, determining an action feature of a target behavior action of the target object, an audio feature of the live data stream and an object feature of the target object, wherein the target behavior action is the behavior action of the target object in the target behavior state; calculating a target background feature using the action feature, the audio feature and the object feature, wherein the target background feature is used to indicate a live background matching the target behavior state of the target object; and under the condition that a target live background matching the target background feature is determined from a background database, switching the live background to the target live background.
According to another aspect of the embodiments of the present invention, there is also provided a device for switching live broadcast backgrounds, including: a determining unit, configured to determine, when it is detected that a target object in a live data stream is currently in a target behavior state, an action feature of a target behavior action of the target object, an audio feature of the live data stream, and an object feature of the target object, where the target behavior action is the behavior action of the target object in the target behavior state; a calculating unit, configured to calculate a target background feature using the action feature, the audio feature, and the object feature, where the target background feature indicates a live background matching the target behavior state of the target object; and a switching unit, configured to switch the live background to the target live background when the target live background matching the target background feature is determined from the background database.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above method for switching live backgrounds when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the above method for switching live backgrounds through the computer program.
In the embodiments of the invention, when it is detected that the target object in the live data stream is in the target behavior state, the target background feature is calculated from the action feature of the target behavior action, the audio feature of the live data stream and the object feature of the target object, and the target live background matching the target background feature is determined from the background database so that the live background is switched to it. By detecting the behavior state of the target object and, whenever the target object enters the target behavior state, determining and switching to the currently matching target live background based on the action, audio and object features, the matching live background is determined dynamically from those features in the target behavior state. This achieves the technical effect of switching the live background dynamically along with the live content, thereby solving the technical problem of a poor live effect caused by a live background that cannot change dynamically.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a schematic diagram of an application environment of an optional live background switching method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating an alternative method for switching live backgrounds according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating an alternative method for switching live backgrounds according to an embodiment of the present invention;
fig. 4 is a flowchart illustrating an alternative method for switching live backgrounds according to an embodiment of the present invention;
fig. 5 is a flowchart illustrating an alternative method for switching live backgrounds according to an embodiment of the present invention;
fig. 6 is a flowchart illustrating an alternative method for switching live backgrounds according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an alternative switching apparatus for live background according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an alternative electronic device according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, a method for switching a live background is provided, which may be, but is not limited to, applied in an environment as shown in fig. 1. The push streaming terminal 100 is the generating end of the live data stream, and the watching terminal 130 is the watching end requesting the live data stream; the push streaming terminal 100 sends the live data stream to the server 120 through the network 110, so that the server 120 pushes it to the watching terminal 130. The server 120 may, among other operations, switch the live background in the live data stream so that it matches the behavior state of the target object. The server 120 receives the original live stream sent by the push streaming terminal 100 through the network 110, switches the live background in the original live stream to obtain a target live stream, and pushes the target live stream to the watching terminal 130 through the network 110, thereby implementing the switching of the live background.
A database 122 and a processing engine 124 run in the server 120. The database 122 stores the received original live stream and the target live stream obtained by switching the live background, and the processing engine 124 switches the live background of the original frames in the original live stream to obtain the target live stream containing the target live background. The server 120 may switch the live background by sequentially performing S102 to S106. S102, determining the action feature, the audio feature and the object feature: when it is detected that the target object in the live data stream is currently in the target behavior state, the action feature of the target behavior action of the target object, the audio feature of the live data stream and the object feature of the target object are determined, where the target behavior action is the behavior action of the target object in the target behavior state. S104, calculating the target background feature: the target background feature is calculated from the action feature, the audio feature and the object feature, and indicates the live background matching the target behavior state of the target object. S106, switching to the target live background: when the target live background matching the target background feature is determined from the background database, the live background is switched to the target live background.
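Steps S102 to S106 can be sketched as a short pipeline. The toy functions below are illustrative assumptions (stand-in extractors, a fixed weighting, L2 nearest-neighbour matching over a tiny database), not the patent's actual implementation:

```python
import numpy as np

# Illustrative sketch of S102-S106; all names and the toy logic are
# assumptions for illustration only.

def detect_state(frame_group):
    # S102 (part 1): stand-in for the video-recognition check;
    # here a group is "in the target state" if it is non-empty.
    return len(frame_group) > 0

def extract_features(frame_group):
    # S102 (part 2): stand-in extractors returning same-size vectors.
    action = np.ones(4)
    audio = np.ones(4) * 0.5
    obj = np.ones(4) * 0.25
    return action, audio, obj

def compute_target_feature(action, audio, obj):
    # S104: a simple weighted combination of the three features.
    return 0.5 * action + 0.3 * audio + 0.2 * obj

def pick_background(target_feat, db):
    # S106: choose the stored background whose feature is closest.
    return min(db, key=lambda k: np.linalg.norm(db[k] - target_feat))

db = {"calm": np.zeros(4), "dance_stage": np.ones(4)}
frames = ["f0", "f1"]
if detect_state(frames):
    a, u, o = extract_features(frames)
    print(pick_background(compute_target_feature(a, u, o), db))  # dance_stage
```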
Optionally, in this embodiment, the push streaming terminal 100 and the watching terminal 130 may be terminal devices configured with a target client, and may include but are not limited to at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The target client is a client with a live broadcast function, and may be, but is not limited to, an audio client, a video client, an instant messaging client, a browser client, an education client, and the like. The network 110 may include, but is not limited to, a wired network or a wireless network, where the wired network includes local area networks, metropolitan area networks and wide area networks, and the wireless network includes Bluetooth, Wi-Fi and other networks enabling wireless communication. The server 120 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is merely an example, and this embodiment is not limited thereto.
As an optional implementation manner, as shown in fig. 2, the method for switching the live background includes:
s202, under the condition that the target object in the live data stream is detected to be in the target behavior state at present, the action characteristic of the target behavior action of the target object, the audio characteristic of the live data stream and the object characteristic of the target object are determined.
In S202 above, the target behavior action is the behavior action that the target object performs while in the target behavior state, and the target behavior state may be defined by a behavior specification and its type. For example, if the behavior actions of the target object follow no behavior specification, the target object is determined to be in a free state; if the behavior actions conform to a gymnastics behavior specification, the target object is determined to be in a gymnastics state; and if the behavior actions conform to a dance behavior specification, the target object is determined to be in a dance state. These behavior states are merely examples and are not intended to be limiting.
The action feature of the target behavior action may be obtained by further classifying the action type of the target behavior action. When the target behavior action is the behavior action of the target object in the target behavior state, a further action type determination is performed on the target behavior action to obtain its action feature.
The audio feature may be a feature of the audio contained in the live data stream, and may include the audio type and various audio parameters. Audio is an important factor in creating atmosphere during live broadcasting; when the behavior actions of the target object conform to a certain behavior specification, those actions are matched with the audio. The audio feature therefore assists in determining the type of the behavior action and the atmosphere of the live broadcast, so that the switched live background matches the audio.
The object feature identifies the target object and may include appearance features and decoration features of the target object. The appearance features may include body features and hair style features; the decoration features may include clothing features and accessory features. The object style of the target object is extracted through the object feature, so that the live background matches the target object in the target behavior state.
And S204, calculating the target background feature using the action feature, the audio feature and the object feature.
In S204 above, the target background feature indicates a live background matching the target behavior state of the target object. The target background feature is determined jointly from the action feature, the audio feature and the object feature, so that it matches the target behavior action of the target object, the audio in the live broadcast and the target object itself. The target background feature may be computed feature data.
Calculating the target background feature using the action feature, the audio feature and the object feature may mean calculating it from at least one of the three. The at least one feature may be any single feature, any combination of two features, or all three features. For example, the target background feature may be calculated from the action feature alone, from the audio feature and the object feature, from the action feature and the audio feature, or from the action feature, the audio feature and the object feature together.
When the target background feature is calculated from any single one of the action feature, the audio feature and the object feature, the calculated target background feature changes when the selected feature changes. When it is calculated from any two of the features, a change in either of the two may change the result. When it is determined jointly from all three features, a change in any one or more of them yields a different target background feature.
The target background feature may be calculated per video frame group, where the live video stream is divided into video frame groups according to a preset data division condition. The target background features corresponding to different video frame groups are determined from the action, audio and object features of the respective groups.
And S206, under the condition that the target live broadcast background matched with the target background characteristics is determined from the background database, switching the live broadcast background into the target live broadcast background.
The background database stores a plurality of live backgrounds, which may be pre-generated background images and/or animations of multiple styles and types. A background animation may be a dynamic background constructed from multiple background images, for example one generated with a modeling method. After a live background is generated, its background feature is determined, and the live background is stored in the background database indexed by that feature.
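As a sketch of the database lookup, assuming each stored live background is indexed by a precomputed feature vector, the match can be found by cosine similarity with a threshold; the threshold value and all names here are assumptions, and when no background is similar enough no switch occurs:

```python
import numpy as np

# Hypothetical background-database lookup: nearest background by cosine
# similarity, gated by a similarity threshold (both are assumptions).

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_background(target_feat, db, threshold=0.8):
    best = max(db, key=lambda k: cosine(db[k], target_feat))
    return best if cosine(db[best], target_feat) >= threshold else None

# Toy database: background id -> precomputed background feature.
db = {
    "ballroom": np.array([1.0, 0.0, 0.0]),
    "street":   np.array([0.0, 1.0, 0.0]),
}
print(match_background(np.array([0.9, 0.1, 0.0]), db))  # ballroom
print(match_background(np.array([0.5, 0.5, 0.7]), db))  # None (no confident match)
```

Gating on a threshold matches the text's phrasing "under the condition that the target live broadcast background ... is determined": if nothing in the database is close enough, the background is simply left unchanged.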
Switching the live background may consist of replacing the region of each video frame in the live data stream outside the target object with the background image corresponding to the target live background. Switching to the target live background does not affect the display of the target object in the original video frame, and the behavior actions of the target object are not changed or adjusted.
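The region replacement described above can be sketched with a portrait segmentation mask, assuming such a mask is available from portrait matting; this is a toy sketch, not the patent's compositing code:

```python
import numpy as np

# Replace everything outside the target object with the new background,
# keeping the anchor's pixels untouched. The mask is assumed to come
# from a portrait-matting model (1 = anchor, 0 = background).

def switch_frame_background(frame, mask, new_background):
    mask3 = mask[..., None]  # broadcast the mask over the color channels
    return mask3 * frame + (1 - mask3) * new_background

frame = np.full((2, 2, 3), 200, dtype=np.uint8)      # toy video frame
background = np.full((2, 2, 3), 10, dtype=np.uint8)  # toy target background
mask = np.array([[1, 0], [0, 0]], dtype=np.uint8)    # anchor occupies one pixel
out = switch_frame_background(frame, mask, background)
print(out[0, 0, 0], out[1, 1, 0])  # 200 10: anchor kept, rest replaced
```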
After the live background is switched to the target live background, the data stream containing the target live background may be pushed to the watching terminal, so that the live background shown on the watching terminal changes along with the behavior state of the target object.
When the target background feature is computed per video frame group, consecutive groups may all find the target object in the target behavior state and yet yield inconsistent background features. In that case, the live background may be switched according to the background feature determined for each group; alternatively, as long as the target object remains in the target behavior state, the background feature may be left unrecomputed and the current target live background retained. Performing the switch per video frame group while the target object is in the target behavior state yields a more dynamic, real-time switching of the live background.
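The second policy above, holding the chosen background while the target behavior state persists, can be sketched as a small stateful controller; this class is an illustrative assumption:

```python
# Hold the background chosen when the target behavior state is entered,
# instead of recomputing it for every video frame group.

class BackgroundController:
    def __init__(self):
        self.current = None  # id of the target background currently shown

    def on_group(self, in_target_state, candidate_background):
        if not in_target_state:
            self.current = None  # revert to the original live background
        elif self.current is None:
            self.current = candidate_background  # switch on state entry
        return self.current  # None means: use the original background

ctl = BackgroundController()
print(ctl.on_group(True, "stage_a"))   # enters state -> stage_a
print(ctl.on_group(True, "stage_b"))   # still in state -> holds stage_a
print(ctl.on_group(False, "stage_c"))  # leaves state -> None (original)
```

Dropping the `self.current is None` guard would instead implement the first policy, re-switching whenever a group's background feature changes.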
As an optional implementation manner, after switching the live background to the target live background, the method further includes: and under the condition that the target object in the live data stream is detected to be switched to the non-target behavior state, switching the target live background into an original live background, wherein the original live background is a live background carried in the live data stream.
When it is determined from the live data stream that the target object is in a non-target behavior state, the server does not adjust the live background, and the original live background carried in the live data stream, i.e. the original live data stream, is used for live stream pushing.
In the embodiments of the application, when it is detected that the target object in the live data stream is in the target behavior state, the target background feature is calculated from the action feature of the target behavior action, the audio feature of the live data stream and the object feature of the target object, and the target live background matching the target background feature is determined from the background database so that the live background is switched to it. By detecting the behavior state of the target object and, whenever the target object enters the target behavior state, determining and switching to the currently matching target live background based on the action, audio and object features, the matching live background is determined dynamically from those features in the target behavior state. This achieves the technical effect of switching the live background dynamically along with the live content, thereby solving the technical problem of a poor live effect caused by a live background that cannot change dynamically.
As an alternative embodiment, as shown in fig. 3, before determining the action characteristic of the target behavior action of the target object, the audio characteristic of the live data stream, and the object characteristic of the target object, the method further includes:
s302, acquiring a current video frame group from a live data stream;
s304, performing behavior type recognition on the behavior action of the target object in the current video frame group by using a video recognition algorithm;
s306, under the condition that the identification result indicates that the behavior type of the behavior action of the target object belongs to the target behavior type, determining that the target object in the live broadcast data stream is currently in the target behavior state.
The current video frame group is determined from the to-be-pushed data of the live data stream. A video frame group may be a video sequence containing either a preset number of frames or the frames within a preset duration.
The video recognition algorithm may be a behavior action classification algorithm used to determine the behavior type of the target object across consecutive video frames. For example, a 10 s video clip may be recognized with TSM (Temporal Shift Module for Efficient Video Understanding) to determine whether the anchor is in a dance state.
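A minimal sketch of grouping a stream into video frame groups and classifying each group follows; the classifier here is a toy placeholder standing in for a clip model such as TSM, and all names are assumptions:

```python
# Divide a frame stream into fixed-size video frame groups and classify
# each group. classify_group is a toy stand-in for a real clip model.

def frame_groups(frames, group_size):
    for i in range(0, len(frames) - group_size + 1, group_size):
        yield frames[i:i + group_size]

def classify_group(group):
    # Placeholder: a real behavior classifier would consume the pixels.
    # Here, a group "is dancing" if most toy frames are labeled "move".
    return "dance" if sum(f == "move" for f in group) > len(group) / 2 else "idle"

stream = ["move", "move", "still", "move", "still", "still"]
print([classify_group(g) for g in frame_groups(stream, 3)])  # ['dance', 'idle']
```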
As an optional implementation, determining the action characteristic of the target behavior action of the target object, the audio characteristic of the live data stream and the object characteristic of the target object includes:
s1, determining the target action type of the target action in the target action type according to the recognition result of the current video frame group by using a video recognition algorithm;
and S2, determining the feature vector of the target action type as the action feature.
The video recognition algorithm may also be used to identify the target action type of a video frame group in the target behavior state, for example a specific dance category: folk dance, jazz, Latin, ballet, street dance or modern dance.
Once the target action type is determined, it may be encoded with a preset encoding method into an action feature vector of the target dimension, and this action feature vector is used as the action feature.
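A minimal sketch of one possible preset encoding is one-hot over the known action types, zero-padded to the target dimension; the category list and dimension below are illustrative only, since the text merely requires some preset encoding into a vector of the target dimension:

```python
import numpy as np

# One-hot encoding of the recognized action type into a fixed-dimension
# action feature vector. Categories and dimension are assumptions.

ACTION_TYPES = ["folk", "jazz", "latin", "ballet", "street", "modern"]
TARGET_DIM = 16  # the text's example target dimension is 1024

def encode_action(action_type, dim=TARGET_DIM):
    vec = np.zeros(dim)
    vec[ACTION_TYPES.index(action_type)] = 1.0
    return vec

v = encode_action("latin")
print(v.shape, int(v[2]))  # (16,) 1
```

In practice a learned embedding could replace the one-hot code, as long as all three feature vectors share the target dimension so they can be combined by vector calculation.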
As an optional implementation, determining the action characteristic of the target behavior action of the target object, the audio characteristic of the live data stream and the object characteristic of the target object includes:
s1, extracting the background audio features of the current video frame group, and determining the feature vector of the background audio features as the audio features, wherein the background audio features comprise at least one of the following: the type characteristic, the frequency characteristic and the loudness characteristic of background audio of the current video frame group;
s2, extracting the characteristics of the anchor object of the current video frame group, and determining the characteristic vector of the anchor object characteristics as the object characteristics, wherein the anchor object characteristics comprise at least one of the following: hair style features, clothing features, facial expression features of the anchor object.
The background audio feature and the anchor object feature may be extracted from the content of the video frame group. The background audio feature may include the audio type, pitch frequency, loudness, detuning, sharpness and other audio characteristics of the group's background audio. After the comprehensive background audio feature is obtained, it is encoded with a preset encoding method into an audio feature vector of the target dimension, and this vector is used as the audio feature.
The anchor object features may include a hair style feature, a clothing feature, a facial expression feature, and other features indicating the appearance of the anchor object. After the comprehensive anchor object features are obtained, they are encoded into an object feature vector of the target dimension using the preset encoding method, and the object feature vector of the target dimension is used as the object feature.
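The "preset encoding method" is left open by the text; one plausible sketch is to pack the raw sub-features (audio type id, pitch frequency, loudness, detuning, sharpness; hair style id, clothing id, expression id) into zero-padded vectors of the target dimension. All names, ids, and values below are assumptions:

```python
# Hypothetical sketch of the preset encoding step: concatenate raw sub-features
# and zero-pad (or truncate) to the shared target dimension.
import numpy as np

TARGET_DIM = 1024

def to_target_dim(raw: list[float], target_dim: int = TARGET_DIM) -> np.ndarray:
    """Pack raw sub-feature values into a fixed-length target-dimension vector."""
    vec = np.zeros(target_dim, dtype=np.float32)
    n = min(len(raw), target_dim)
    vec[:n] = raw[:n]
    return vec

# Background audio features: [audio-type id, pitch Hz, loudness dB, detuning, sharpness]
audio_feature = to_target_dim([2.0, 220.0, -14.0, 0.1, 0.7])
# Anchor object features: [hair-style id, clothing id, facial-expression id]
object_feature = to_target_dim([3.0, 7.0, 1.0])
print(audio_feature.shape, object_feature.shape)  # (1024,) (1024,)
```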
The target dimension indicates that the extracted action feature vector, audio feature vector, and object feature vector are vectors of the same size, so that the target background feature can be obtained by vector calculation. The target dimension may indicate the amount of data each feature vector contains; for example, (1, 1024) indicates that each feature vector consists of 1024 values. The above example does not limit the amount of data of the target dimension.
In the embodiment of the present application, the action feature is determined through video recognition, the audio feature and the object feature are obtained through content extraction, and all three are encoded as numeric feature vectors, which facilitates the calculation of the background feature.
As an alternative embodiment, as shown in fig. 4, calculating the target background feature using the action feature, the audio feature, and the object feature includes:
S402, performing weighted calculation on the action feature, the audio feature, and the object feature according to the feature weighting parameters corresponding to each of them, to obtain a target feature vector;
S404, converting the target feature vector into a target background feature in a target data format using a target mapping function.
The target feature vector is obtained by weighting the action feature, the audio feature, and the object feature by their respective weighting parameters. The individual weighting parameters are not limited here; note, however, that the weighting parameters of the action feature, the audio feature, and the object feature sum to 1.
The target data format may be the data format of the background features of the live backgrounds in the background database, so that the live background can be searched in the background database using the target background feature in the target data format. The target data format may also be associated with the tags that represent backgrounds in the background database. For example, if 100 tags are preset in the background database, the target data format may be (1, 100), so that the target feature vector is mapped to tag data and the background search is performed in the background database based on that tag data. The above example is for illustration only and does not specifically limit the target data format.
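Steps S402 and S404 can be sketched as follows; the linear projection and softmax merely stand in for the unspecified "target mapping function", and the weights, dimensions, and random features are assumptions:

```python
# Sketch of S402/S404: weight and sum the three target-dimension vectors
# (weights sum to 1), then map the fused vector to the 100-tag space of the
# background database.
import numpy as np

rng = np.random.default_rng(0)
TARGET_DIM, NUM_LABELS = 1024, 100

action = rng.random(TARGET_DIM).astype(np.float32)
audio = rng.random(TARGET_DIM).astype(np.float32)
obj = rng.random(TARGET_DIM).astype(np.float32)

w_action, w_audio, w_obj = 0.5, 0.3, 0.2    # feature weighting parameters, sum to 1
target_vec = w_action * action + w_audio * audio + w_obj * obj  # S402

W = rng.standard_normal((TARGET_DIM, NUM_LABELS)).astype(np.float32)  # assumed projection
logits = target_vec @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()                        # S404: scores over the 100 tags, i.e. (1, 100)
best_label = int(probs.argmax())            # tag used to search the background database
print(probs.shape, best_label)
```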
As an alternative implementation, as shown in fig. 5, switching the live background to the target live background includes:
S502, sequentially performing the following operations on each original frame in the current video frame group until target frames corresponding to all the original frames in the current video frame group are obtained:
S504, extracting an object area image including the target object from the original frame using an image processing network;
S506, generating a target frame using the object area image and the target live background.
When the target live background of the video frame group has been determined, switching the live background may be performed by replacing the live background of each video frame in the video frame group with that background: an object area image of the target object is extracted from each original frame in the video frame group, and a target frame that replaces the original frame is generated from the object area image and the target live background.
The image processing network used to extract the object area image may be a lightweight MODNet network, which performs portrait matting on the original frame to extract the object area image.
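Given an alpha matte produced by a matting network such as MODNet (not reimplemented here), the compositing of S504/S506 can be sketched in plain NumPy; the array shapes and the toy matte are assumptions:

```python
# Sketch of S504/S506: blend the object area (alpha near 1) of the original
# frame over the target live background (alpha near 0).
import numpy as np

def composite_frame(original: np.ndarray, background: np.ndarray,
                    alpha: np.ndarray) -> np.ndarray:
    """Alpha-composite the anchor region onto the target background."""
    a = alpha[..., None].astype(np.float32)          # (H, W) -> (H, W, 1)
    return (a * original + (1.0 - a) * background).astype(original.dtype)

h, w = 4, 4
original = np.full((h, w, 3), 200, dtype=np.uint8)   # stand-in original frame
background = np.zeros((h, w, 3), dtype=np.uint8)     # stand-in target background
alpha = np.zeros((h, w), dtype=np.float32)
alpha[1:3, 1:3] = 1.0                                # toy matte: anchor in the center

target_frame = composite_frame(original, background, alpha)
print(target_frame[2, 2], target_frame[0, 0])        # [200 200 200] [0 0 0]
```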
The switching flow of the live background may be as shown in fig. 6. S602, determining the current video frame group from the live video stream, for example, acquiring a group of video frames spanning 10 seconds. S604, performing video recognition on the current video frame group to determine whether the target object is in the target behavior state, for example, recognizing whether the anchor is dancing. If the anchor is determined to be in the dancing state, the switching flow continues; if the anchor is determined not to be in the dancing state, the switching flow terminates, and the original live data stream is pushed directly.
S606, determining the video dance features, such as the dance type feature, based on the video recognition result. S608, extracting the attribute features of the live content: content attribute features of the video frame group are extracted to obtain the audio feature and the object feature. S610, obtaining the background feature through feature fusion and determining the target live background: after the background feature is obtained by fusing the video dance feature, the audio feature, and the object feature, a background search is performed in the background database to obtain the target live background.
S612, image matting: portrait matting is performed on the video frames in the video frame group to obtain the area image where the anchor is located. S614, switching to the target live background: a target frame is generated from the area image and the target live background, and the target frame is used for live streaming, thereby switching the live background.
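The S602-S614 flow above can be sketched as a plain control loop; every helper below is a trivial hypothetical stand-in for the recognition, extraction, fusion, lookup, and matting components described earlier:

```python
# End-to-end sketch of the switching flow. All helpers are toy stand-ins
# (assumptions), not the patent's actual components.
def in_target_behavior_state(frames):          # S604: stand-in recognizer
    return bool(frames)

def extract_features(frames):                  # S606/S608: stand-in extractors
    return {"action": 1.0, "audio": 0.5, "object": 0.2}

def fuse(features):                            # S610: stand-in weighted fusion
    return 0.5 * features["action"] + 0.3 * features["audio"] + 0.2 * features["object"]

def lookup_background(score):                  # S610: stand-in database search
    return "stage" if score > 0.5 else "default"

def switch_background(frames, background):     # S612/S614: stand-in matte + composite
    return [(frame, background) for frame in frames]

def process_frame_group(frames):
    if not in_target_behavior_state(frames):   # not dancing: push the original stream
        return frames
    bg = lookup_background(fuse(extract_features(frames)))
    return switch_background(frames, bg)

print(process_frame_group(["f1", "f2"]))  # [('f1', 'stage'), ('f2', 'stage')]
```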
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for switching a live background, which is used to implement the above method for switching a live background. As shown in fig. 7, the apparatus includes:
a determining unit 702, configured to determine, when it is detected that a target object in a live data stream is currently in a target behavior state, an action feature of a target behavior action of the target object, an audio feature of the live data stream, and an object feature of the target object, where the target behavior action is a behavior action of the target object in the target behavior state;
a calculating unit 704, configured to calculate a target background feature using the action feature, the audio feature, and the object feature, where the target background feature is used to indicate a live background matching a target behavior state of the target object;
a switching unit 706, configured to switch the live broadcast background to a target live broadcast background when the target live broadcast background matched with the target background feature is determined from the background database.
Optionally, the apparatus for switching live broadcast backgrounds further includes an identification unit, configured to obtain a current video frame group from the live broadcast data stream before determining an action feature of a target behavior action of the target object, an audio feature of the live broadcast data stream, and an object feature of the target object; performing behavior type recognition on the behavior action of the target object in the current video frame group by using a video recognition algorithm; and under the condition that the identification result indicates that the behavior type of the behavior action of the target object belongs to the target behavior type, determining that the target object in the live broadcast data stream is currently in the target behavior state.
Optionally, the determining unit 702 includes:
a first determining module, configured to determine, from among the target action types, the target action type of the target behavior action according to the recognition result of the video recognition algorithm on the current video frame group, and to determine the feature vector of the target action type as the action feature.
Optionally, the determining unit 702 includes:
a second determining module, configured to extract a background audio feature of the current video frame group, and determine a feature vector of the background audio feature as an audio feature, where the background audio feature includes at least one of: the type characteristic, the frequency characteristic and the loudness characteristic of background audio of the current video frame group;
a third determining module, configured to extract a anchor object feature of the current video frame group, and determine a feature vector of the anchor object feature as an object feature, where the anchor object feature includes at least one of: hair style features, clothing features, facial expression features of the anchor object.
Optionally, the calculating unit 704 is further configured to perform weighted calculation on the action feature, the audio feature, and the object feature according to the feature weighting parameters corresponding to each of them, to obtain a target feature vector; and to convert the target feature vector into a target background feature in a target data format using a target mapping function.
Optionally, the switching unit 706 includes:
the switching module is used for sequentially performing the following operations on each original frame in the current video frame group until target frames corresponding to all the original frames in the current video frame group are obtained: extracting an object area image including a target object in an original frame by using an image processing network; and generating a target frame by using the target area image and the target live broadcast background.
Optionally, the device for switching a live broadcast background further includes an updating unit, configured to switch a target live broadcast background to an original live broadcast background when it is detected that a target object in a live broadcast data stream is switched to a non-target behavior state after the live broadcast background is switched to the target live broadcast background, where the original live broadcast background is a live broadcast background carried in the live broadcast data stream.
In the embodiment of the present application, when a target object in a live data stream is detected to be in a target behavior state, the target background feature is calculated from the action feature of the target behavior action, the audio feature of the live data stream, and the object feature of the target object; a target live background matching the target background feature is determined from the background database; and the live background is switched to the target live background. By detecting the behavior state of the target object in the live data stream and, when the target object is in the target behavior state, determining and switching to the currently matching target live background based on the action feature, the audio feature, and the object feature, the matching live background is determined dynamically for the target behavior state. This achieves the technical effect of dynamically switching the live background, and thereby solves the technical problem that the live effect is poor because the live background cannot change dynamically.
According to another aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the above method for switching live broadcast backgrounds, where the electronic device may be a terminal device or a server shown in fig. 1. The present embodiment takes the electronic device as a server as an example for explanation. As shown in fig. 8, the electronic device comprises a memory 802 and a processor 804, the memory 802 having a computer program stored therein, the processor 804 being arranged to perform the steps of any of the above-described method embodiments by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, under the condition that it is detected that a target object in the live broadcast data stream is currently in a target behavior state, determining the action characteristics of a target behavior action of the target object, the audio characteristics of the live broadcast data stream and the object characteristics of the target object, wherein the target behavior action is the behavior action of the target object in the target behavior state;
s2, calculating a target background feature by utilizing the action feature, the audio feature and the object feature, wherein the target background feature is used for indicating a live background matched with the target behavior state of the target object;
and S3, under the condition that the target live broadcast background matched with the target background characteristics is determined from the background database, switching the live broadcast background into the target live broadcast background.
Alternatively, those skilled in the art will understand that the structure shown in fig. 8 is only illustrative, and the electronic device may also be a terminal device such as a smartphone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 8 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 8, or have a different configuration from that shown in fig. 8.
The memory 802 may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for switching a live background in the embodiments of the present invention. The processor 804 executes the software programs and modules stored in the memory 802 to perform various functional applications and data processing, that is, to implement the method for switching a live background. The memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 802 may further include memory located remotely from the processor 804, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 802 may specifically, but not exclusively, store information such as the action feature, the audio feature, the object feature, the target background feature, the background database, and the target live background. As an example, as shown in fig. 8, the memory 802 may include, but is not limited to, the determining unit 702, the calculating unit 704, and the switching unit 706 of the above apparatus for switching a live background. It may further include, but is not limited to, other module units of that apparatus, which are not described again in this example.
Optionally, the transmission device 806 is used to receive or send data via a network. Examples of the network may include wired and wireless networks. In one example, the transmission device 806 includes a network adapter (Network Interface Controller, NIC) that can be connected via a network cable to a router or other network device so as to communicate with the internet or a local area network. In another example, the transmission device 806 is a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
In addition, the electronic device further includes: a display 808, configured to display the live data stream and a target live background; and a connection bus 810 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through a network communication. Nodes can form a Peer-To-Peer (P2P, Peer To Peer) network, and any type of computing device, such as a server, a terminal, and other electronic devices, can become a node in the blockchain system by joining the Peer-To-Peer network.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various alternative implementations of switching live backgrounds described above. The computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, under the condition that it is detected that a target object in the live broadcast data stream is currently in a target behavior state, determining the action characteristics of a target behavior action of the target object, the audio characteristics of the live broadcast data stream and the object characteristics of the target object, wherein the target behavior action is the behavior action of the target object in the target behavior state;
s2, calculating a target background feature by utilizing the action feature, the audio feature and the object feature, wherein the target background feature is used for indicating a live background matched with the target behavior state of the target object;
and S3, under the condition that the target live broadcast background matched with the target background characteristics is determined from the background database, switching the live broadcast background into the target live broadcast background.
Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for switching live broadcast background is characterized by comprising the following steps:
under the condition that a target object in a live data stream is detected to be in a target behavior state currently, determining action characteristics of a target behavior action of the target object, audio characteristics of the live data stream and object characteristics of the target object, wherein the target behavior action is a behavior action of the target object in the target behavior state;
calculating a target background feature using the action feature, the audio feature and the object feature, wherein the target background feature is used for indicating a live background matched with the target behavior state of the target object;
and under the condition that a target live broadcast background matched with the target background characteristics is determined from a background database, switching the live broadcast background into the target live broadcast background.
2. The method of claim 1, further comprising, prior to determining the action characteristic of the target behavioral action of the target object, the audio characteristic of the live data stream, and the object characteristic of the target object:
acquiring a current video frame group from the live data stream;
performing behavior type recognition on the behavior action of the target object in the current video frame group by using a video recognition algorithm;
and under the condition that the identification result indicates that the behavior type of the behavior action of the target object belongs to the target behavior type, determining that the target object in the live data stream is currently in the target behavior state.
3. The method of claim 2, wherein the determining an action characteristic of a target behavioral action of the target object, an audio characteristic of the live data stream, and an object characteristic of the target object comprises:
determining, from among target action types, a target action type of the target behavior action according to the recognition result of the video recognition algorithm on the current video frame group;
determining the feature vector of the target action type as the action feature.
4. The method of claim 2, wherein the determining an action characteristic of a target behavioral action of the target object, an audio characteristic of the live data stream, and an object characteristic of the target object comprises:
extracting background audio features of the current video frame group, and determining feature vectors of the background audio features as the audio features, wherein the background audio features comprise at least one of the following: the type characteristic, the frequency characteristic and the loudness characteristic of background audio of the current video frame group; extracting anchor object features of the current video frame group, and determining feature vectors of the anchor object features as the object features, wherein the anchor object features include at least one of: hair style features, clothing features, facial expression features of the anchor object.
5. The method of claim 4, wherein the computing a target background feature using the motion feature, the audio feature, and the object feature comprises:
according to feature weighting parameters corresponding to the action features, the audio features and the object features, performing weighting calculation on the action features, the audio features and the object features to obtain target feature vectors;
and converting the target feature vector into the target background feature in a target data format by using a target mapping function.
6. The method of claim 2, wherein switching the live context to the target live context comprises:
sequentially performing the following operations on each original frame in the current video frame group until target frames corresponding to all the original frames in the current video frame group are obtained:
extracting an object area image including the target object in the original frame by using an image processing network;
and generating the target frame by using the object area image and the target live background.
7. The method of any one of claims 1 to 6, further comprising, after switching the live background to the target live background:
and under the condition that the target object in the live data stream is detected to be switched to a non-target behavior state, switching the target live background into an original live background, wherein the original live background is a live background carried in the live data stream.
8. A switching device for live backgrounds, comprising:
a determining unit, configured to determine, when it is detected that a target object in a live data stream is currently in a target behavior state, an action feature of a target behavior action of the target object, an audio feature of the live data stream, and an object feature of the target object, wherein the target behavior action is the behavior action of the target object in the target behavior state;
a calculating unit, configured to calculate a target background feature using the action feature, the audio feature, and the object feature, where the target background feature is used to indicate a live background matching the target behavior state of the target object;
and the switching unit is used for switching the live broadcast background into the target live broadcast background under the condition that the target live broadcast background matched with the target background characteristics is determined from the background database.
9. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
CN202210080646.XA 2022-01-24 2022-01-24 Live broadcast background switching method and device, storage medium and electronic equipment Pending CN114501060A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210080646.XA CN114501060A (en) 2022-01-24 2022-01-24 Live broadcast background switching method and device, storage medium and electronic equipment


Publications (1)

Publication Number Publication Date
CN114501060A true CN114501060A (en) 2022-05-13

Family

ID=81474605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210080646.XA Pending CN114501060A (en) 2022-01-24 2022-01-24 Live broadcast background switching method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114501060A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022668A (en) * 2022-07-21 2022-09-06 中国平安人寿保险股份有限公司 Video generation method and device based on live broadcast, equipment and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791893A (en) * 2016-11-14 2017-05-31 Beijing Xiaomi Mobile Software Co Ltd Net cast method and device
CN107920256A (en) * 2017-11-30 2018-04-17 Guangzhou Kugou Computer Technology Co Ltd Live data playback method, device and storage medium
CN111405307A (en) * 2020-03-20 2020-07-10 Guangzhou Huaduo Network Technology Co Ltd Live broadcast template configuration method and device and electronic equipment
CN111491123A (en) * 2020-04-17 2020-08-04 Vivo Mobile Communication Co Ltd Video background processing method and device and electronic equipment
CN112330579A (en) * 2020-10-30 2021-02-05 China Ping An Life Insurance Co Ltd Video background replacing method and device, computer equipment and computer readable medium
CN113240702A (en) * 2021-06-25 2021-08-10 Beijing Sensetime Technology Development Co Ltd Image processing method and device, electronic equipment and storage medium
CN113507621A (en) * 2021-07-07 2021-10-15 Shanghai Sensetime Intelligent Technology Co Ltd Live broadcast method, device, system, computer equipment and storage medium
CN113570686A (en) * 2021-02-07 2021-10-29 Tencent Technology (Shenzhen) Co Ltd Virtual video live broadcast processing method and device, storage medium and electronic equipment


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022668A (en) * 2022-07-21 2022-09-06 China Ping An Life Insurance Co Ltd Video generation method and device based on live broadcast, equipment and medium
CN115022668B (en) * 2022-07-21 2023-08-11 China Ping An Life Insurance Co Ltd Live broadcast-based video generation method and device, equipment and medium

Similar Documents

Publication Publication Date Title
CN110134829B (en) Video positioning method and device, storage medium and electronic device
US9749710B2 (en) Video analysis system
US11115724B2 (en) Visual hash tags via trending recognition activities, systems and methods
CN112565825A (en) Video data processing method, device, equipment and medium
US20210314635A1 (en) System and method for providing image-based video service
WO2020107624A1 (en) Information pushing method and apparatus, electronic device and computer-readable storage medium
CN103581705A (en) Method and system for recognizing video program
CN107295352B (en) Video compression method, device, equipment and storage medium
CN202998337U (en) Video program identification system
CN104216956A (en) Method and device for searching picture information
CN107547922B (en) Information processing method, device, system and computer readable storage medium
CN113891105A (en) Picture display method and device, storage medium and electronic equipment
CN115222862A (en) Virtual human clothing generation method, device, equipment, medium and program product
CN114501060A (en) Live broadcast background switching method and device, storage medium and electronic equipment
CN113301386B (en) Video processing method, device, server and storage medium
CN112328895A (en) User portrait generation method, device, server and storage medium
CN113076159A (en) Image display method and apparatus, storage medium, and electronic device
CN114449346B (en) Video processing method, device, equipment and storage medium
CN114554111A (en) Video generation method and device, storage medium and electronic equipment
CN110232393B (en) Data processing method and device, storage medium and electronic device
CN114466224A (en) Video data encoding and decoding method and device, storage medium and electronic equipment
CN114268848A (en) Video generation method and device, electronic equipment and storage medium
CN114139491A (en) Data processing method, device and storage medium
CN112487223A (en) Image processing method and device and electronic equipment
CN112101197A (en) Face information acquisition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination