CN112752116A - Display method, device, terminal and storage medium of live video picture

Display method, device, terminal and storage medium of live video picture

Info

Publication number
CN112752116A
CN112752116A
Authority
CN
China
Prior art keywords
live video, picture, video picture, live, adjusted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011612821.2A
Other languages
Chinese (zh)
Inventor
曾冠东
陈盛福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Fanxing Huyu IT Co Ltd
Original Assignee
Guangzhou Fanxing Huyu IT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Fanxing Huyu IT Co Ltd filed Critical Guangzhou Fanxing Huyu IT Co Ltd
Priority to CN202011612821.2A priority Critical patent/CN112752116A/en
Publication of CN112752116A publication Critical patent/CN112752116A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration

Abstract

The application discloses a display method, apparatus, terminal and storage medium for live video pictures. The method includes: displaying a live video picture of a target live broadcast room, where the live video picture is a video image frame obtained by decoding a live video stream of the target live broadcast room; marking a character region in the live video picture in response to an adjustment trigger operation on the live video picture; adjusting the live video picture based on the position of the character region in the live video picture to obtain an adjusted live video picture; and displaying the adjusted live video picture. The method and apparatus let viewer users customize the live video picture while watching a live broadcast, improving the flexibility of live video picture display.

Description

Display method, device, terminal and storage medium of live video picture
Technical Field
The embodiments of the present application relate to the field of computer and internet technologies, and in particular to a display method, apparatus, terminal and storage medium for live video pictures.
Background
Live video applications are popular with a large number of users.
In the related art, the anchor client sends a recorded live video stream to a server, the server forwards the live video stream to the viewer client, and the viewer client plays it.
Currently, the live video picture displayed by the viewer client is determined entirely by the live video stream sent by the server, which lacks flexibility.
Disclosure of Invention
The embodiments of the present application provide a display method, apparatus, terminal and storage medium for live video pictures, so that viewer users can customize the live video picture while watching a live broadcast, which improves the flexibility of live video picture display. The technical solution is as follows:
according to an aspect of an embodiment of the present application, there is provided a method for displaying a live video frame, the method including:
displaying a live video picture of a target live broadcast room, wherein the live video picture is a video image frame obtained by decoding a live video stream of the target live broadcast room;
marking a character area in the live video picture in response to an adjustment triggering operation for the live video picture;
adjusting the live video picture based on the position of the character area in the live video picture to obtain an adjusted live video picture;
and displaying the adjusted live video picture.
According to an aspect of the embodiments of the present application, there is provided a display apparatus for live video pictures, the apparatus including:
a picture display module, configured to display a live video picture of a target live broadcast room, where the live video picture refers to a video image frame obtained by decoding a live video stream of the target live broadcast room;
the character marking module is used for responding to the adjustment triggering operation aiming at the live video picture and marking the character area in the live video picture;
the picture adjusting module is used for adjusting the live video picture based on the position of the character area in the live video picture to obtain an adjusted live video picture;
and the picture display module is also used for displaying the adjusted live video picture.
According to an aspect of the embodiments of the present application, there is provided a terminal including a processor and a memory, where the memory stores a computer program that is loaded and executed by the processor to implement the above display method for live video pictures.
According to an aspect of the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above display method for live video pictures.
According to an aspect of the embodiments of the present application, there is provided a computer program product which, when run on a terminal, causes the terminal to execute the above display method for live video pictures.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
after the character region in the live video picture is determined, the live video picture is adjusted based on the position of that region in the picture, so that viewer users can customize the live video picture while watching a live broadcast, improving the flexibility of live video picture display.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
fig. 2 is a flowchart of a method for displaying a live video frame according to an embodiment of the present application;
fig. 3 to 11 are schematic views of interfaces according to embodiments of the present application;
FIG. 12 is a flowchart of a person identification method provided in one embodiment of the present application;
FIG. 13 is a diagram illustrating foreground picture generation according to an embodiment of the present application;
fig. 14 is a block diagram of a display device for live video frames according to an embodiment of the present application;
fig. 15 is a block diagram of a display device for live video frames according to another embodiment of the present application;
fig. 16 is a block diagram of a terminal according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, a schematic diagram of an implementation environment of an embodiment of the present application is shown. The environment can be implemented as a live video system and may include: an anchor terminal 10, a server 20 and a viewer terminal 30.
A client of the live video application, which may be referred to as the anchor client, may be installed in the anchor terminal 10 for use by an anchor user. The anchor user records a live video stream through the anchor client and pushes it through the server 20 to viewer users for watching. There may be a plurality of anchor terminals 10; for example, different anchor users may each conduct a different live broadcast.
A client of the live video application, which may be referred to as the viewer client, may be installed in the viewer terminal 30 for use by a viewer user. The viewer client receives the live video stream of the anchor user from the server 20 and plays it for the viewer user to watch. There may be a plurality of viewer terminals 30; for example, different viewer users may watch different live broadcasts.
The server 20 may be a backend server of the above live video application and provides backend services for the clients. The server 20 may be one server, a server cluster composed of multiple servers, or a cloud computing service center. The server 20 establishes communication connections with the anchor terminal 10 and the viewer terminals 30 via a network.
The anchor terminal 10 and the viewer terminal 30 may be electronic devices such as a mobile phone, a tablet computer, a multimedia player, a smart TV or a PC (Personal Computer). The anchor client and the viewer client may be two different versions of the live video application, for example one with the function of recording a live video stream and the other with the function of displaying it; they may also be the same version, which has both functions. This is not limited in the embodiments of the present application.
Referring to fig. 2, a flowchart of a method for displaying a live video picture according to an embodiment of the present application is shown. The method can be applied in the viewer terminal 30 of the implementation environment shown in fig. 1; for example, the execution subject of each step may be the viewer client installed and running in the viewer terminal 30. The method includes the following steps (210-240):
step 210, displaying a live video picture of the target live broadcast room, where the live video picture refers to a video image frame obtained by decoding a live video stream of the target live broadcast room.
The target live broadcast room can be any live video broadcast room, and the live video picture is a video picture obtained by shooting the live scene, i.e., a video image frame obtained by decoding the live video stream of the target live broadcast room. The content of the live video picture is determined by the characters and objects in the live scene. For example, if a live scene includes one or more characters and objects such as a table and a microphone, a video picture shot of that scene also includes those one or more characters and those objects. The embodiments of the present application do not limit the type of the live broadcast, which may be singing, dancing, gaming, lecturing, shopping, traveling, and so on.
Step 220, in response to the adjustment triggering operation for the live video picture, marking the character area in the live video picture.
A live video picture may include one or more target objects. For example, a target object may be a character, such as an anchor user, and a live video picture may include one or more characters, such as one or more anchor users. Of course, in some other embodiments, the target object may also be picture content other than a character, such as one or more specific objects in the live video picture (a table, a cup and the like) or an animal (a cat, a dog and the like), which is not limited in the embodiments of the present application.
The adjustment trigger operation is an operation performed by the viewer user to trigger adjustment of the live video picture. The embodiments of the present application do not limit the operation form of the adjustment trigger operation, which may be, for example, a finger touch operation, a mouse click operation, or an operation in voice or gesture form. In one example, the user interface displaying the live video picture includes a specific control, and the user clicks that control with a finger or a mouse to perform the adjustment trigger operation.
In response to the adjustment trigger operation on the live video picture, the viewer client identifies the character region in the live video picture and marks the identified character region. The character region is a local area of the live video picture that contains a character. It may be an irregular region that coincides with or matches the outline of the character, or a regular or irregular region that contains the character, such as the minimum rectangular box region containing the character. Optionally, the viewer client may mark the character region by displaying a marking box or a marking arrow corresponding to the character region. The embodiments of the present application do not limit the marking manner; any manner that highlights the character region may be used. The identification of the character region is described in the embodiments below.
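For illustration only (not part of the original disclosure), a minimal Python/OpenCV sketch of the marking step is given below, assuming the character region has already been identified as a bounding box; the function name and parameters are illustrative assumptions.

```python
import cv2

def draw_person_marker(frame, bbox, color=(0, 255, 0), thickness=2):
    """Draw a rectangular marking box around an identified character region.

    frame: BGR video image frame decoded from the live video stream.
    bbox:  (x, y, w, h) of the character region in frame coordinates.
    """
    x, y, w, h = bbox
    cv2.rectangle(frame, (x, y), (x + w, y + h), color, thickness)
    return frame
```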
In addition, the user interface displayed by the viewer client may include a picture display layer and a control display layer. The control display layer has a higher display level than the picture display layer, i.e., it lies above the picture display layer. The picture display layer displays the live video picture, while the control display layer displays operation controls for human-computer interaction, such as a close control, a chat control and a gift-sending control. In the embodiments of the present application, the control display layer may also include the specific control through which the viewer user performs the adjustment trigger operation; this control may be referred to as a character marking control.
Illustratively, as shown in fig. 3, taking the character being an anchor user as an example, a live video picture 30 and a character marking control 31 located on the layer above the live video picture 30 are displayed in the user interface. The viewer user clicks the character marking control 31, and in response to the clicking operation the viewer client marks the anchor user 32 in the live video picture 30, for example by displaying a rectangular marking box 33 surrounding the anchor user 32.
It should be noted that a live video picture may include one character or a plurality of characters. Optionally, when multiple characters are identified in the live video picture, each of them may be marked, and the viewer user can select one or more of them for subsequent processing according to actual needs.
Illustratively, as shown in fig. 4, still taking the characters being anchor users as an example, the live video picture 40 includes 3 anchor users. The viewer user clicks the character marking control 41, and in response to the clicking operation the viewer client marks the 3 anchor users in the live video picture 40 respectively, for example by displaying 3 marking boxes, each marking one anchor user. Optionally, as shown in fig. 4, the viewer user clicks the middle marking box to trigger a selection operation on the target marking box 42; the viewer client keeps displaying the target marking box 42 and cancels the display of the other two. The anchor user inside the target marking box 42 is the target anchor user selected by the viewer user.
Step 230, adjusting the live video picture based on the position of the character area in the live video picture to obtain an adjusted live video picture.
In the embodiments of the present application, after the character region is determined, the live video picture is adjusted based on the position of the character region in the live video picture, so that the viewer user can customize the live video picture while watching a live broadcast, which improves the flexibility of live video picture display.
In one example, the character region is extracted based on its position in the live video picture and composited into a target background picture to obtain the adjusted live video picture. The target background picture may be a background picture selected by the user from multiple candidate background pictures, or one determined by the viewer client, for example randomly. The background content of the target background picture may be a virtual scene (such as virtual mountains or an ocean) or a shot of a real scene, which is not limited in the embodiments of the present application. Illustratively, as shown in fig. 5, the upper left shows an original live video picture 50 whose background content is trees, and the lower left shows a target background picture 51 whose background content is tall buildings; the adjusted live video picture 53 is obtained by extracting the character region 52 from the original live video picture 50 and compositing it into the target background picture 51. In the adjusted live video picture 53 the background content is the tall buildings, i.e., the background content of the live video picture has been changed. When the user selects, or the client randomly picks, multiple target background pictures, different target background pictures can be switched in for compositing with the character region according to a switching period, for example one target background picture per minute, to generate the adjusted live video pictures. The switching period may be set by the user or by a client default, which is not limited in the embodiments of the present application.
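As an illustrative sketch of this background-replacement example (not part of the original disclosure), the snippet below composites the character region onto a target background picture, assuming a binary foreground mask for the character region is available (mask generation is described in the character identification embodiment later); all names are assumptions.

```python
import cv2

def composite_background(frame, mask, background):
    """Composite the character region of `frame` onto a target background picture.

    frame:      BGR live video image frame.
    mask:       uint8 mask, 255 inside the character region, 0 elsewhere.
    background: target background picture (resized here to match the frame).
    """
    background = cv2.resize(background, (frame.shape[1], frame.shape[0]))
    mask3 = cv2.merge([mask, mask, mask])              # 3-channel mask
    person = cv2.bitwise_and(frame, mask3)             # keep only the character
    hole = cv2.bitwise_and(background, cv2.bitwise_not(mask3))  # cut a hole
    return cv2.add(person, hole)                       # adjusted live video picture
```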
In another example, the display position of bullet screen information is determined based on the position of the character region in the live video picture, such that the display position of the bullet screen information does not overlap the position of the character region; the bullet screen information is then added to the live video picture at that display position to obtain the adjusted live video picture. Bullet screen information refers to the scrolling comments formed by chat messages, gift messages or other information sent by viewer users while watching the live broadcast; it can be displayed on the layer above the live video picture, for example moving from left to right. In the embodiments of the present application, setting the display position of the bullet screen information so that it does not overlap the character region prevents the bullet screen from occluding the character region (such as the anchor user). For example, as shown in fig. 6, after the target anchor user 61 is determined in the live video picture 60, the bullet screen information 62 avoids the display position of the target anchor user 61 while being displayed, so as not to occlude the target anchor user 61. In an exemplary embodiment, the display position of the bullet screen information may be controlled to not overlap only the face area within the character region, while being allowed to overlap the other parts of the character region, so as to ensure sufficient display space for the bullet screen information.
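One way to realize this non-overlap constraint, sketched for illustration only: horizontal bullet-screen lanes whose vertical extent intersects the character's bounding box (or only its face box, in the relaxed variant above) are skipped. The lane model and all names are assumptions of this sketch.

```python
def free_danmaku_lanes(frame_height, lane_height, avoid_box):
    """Return y offsets of horizontal lanes that do not overlap the avoided box.

    frame_height: height of the live video picture in pixels.
    lane_height:  height of one bullet-screen text lane.
    avoid_box:    (x, y, w, h) region to keep clear, e.g. the character region
                  or only its face sub-region.
    """
    _, top, _, box_h = avoid_box
    bottom = top + box_h
    lanes = []
    for lane_top in range(0, frame_height - lane_height + 1, lane_height):
        # keep the lane only if it lies entirely above or below the avoided box
        if lane_top + lane_height <= top or lane_top >= bottom:
            lanes.append(lane_top)
    return lanes
```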
In another example, the character region is adjusted based on its position in the live video picture to obtain an adjusted character region, and the adjusted live video picture is obtained based on the adjusted character region. Optionally, the adjustment processing includes, but is not limited to, at least one of: zooming, stretching, position adjustment, moving the character region out of the live video picture, replacing the character region with other display elements, copying, and mirroring.
Zooming reduces or enlarges the display size of the character region. Stretching stretches the display area of the character region in the lateral, longitudinal or another direction. Illustratively, as shown in fig. 7, taking zooming as an example, the left side shows an original live video picture 70, and the adjusted live video picture 72 shown on the right is obtained by shrinking the character region 71. The zoom or stretch ratio may be set by the user or by a client default, which is not limited in the embodiments of the present application. For consecutive frames of live video pictures, the ratio applied to each frame may be the same or different; when the ratios differ from frame to frame, the adjusted pictures present a visual effect of the character jumping.
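A sketch of the zooming case, for illustration only: the character region is cut out, resized, and pasted back over a background. Filling the vacated pixels from a background template picture is an assumption of this sketch, as are the names and the bottom-center anchoring.

```python
import cv2

def zoom_person(frame, background, bbox, scale):
    """Re-composite the character region of `frame` at a new scale.

    frame:      current live video image frame (BGR).
    background: picture used to fill vacated pixels, same size as frame
                (e.g. a background template picture).
    bbox:       (x, y, w, h) of the character region in `frame`.
    scale:      e.g. 0.5 shrinks the character, 2.0 enlarges it.
    """
    x, y, w, h = bbox
    person = frame[y:y + h, x:x + w]
    nw, nh = max(1, int(w * scale)), max(1, int(h * scale))
    person = cv2.resize(person, (nw, nh))
    out = background.copy()
    # anchor the scaled region at the original bottom-center of the character
    cx, bottom = x + w // 2, y + h
    nx, ny = max(0, cx - nw // 2), max(0, bottom - nh)
    out[ny:ny + nh, nx:nx + nw] = person[:out.shape[0] - ny, :out.shape[1] - nx]
    return out
```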
Position adjustment changes the display position of the character region. Illustratively, as shown in fig. 8, the left side shows an original live video picture 80, and the adjusted live video picture 82 shown on the right is obtained by adjusting the display position of the character region 81 within the picture. The position to which the character region is moved can be selected by the user through an operation. In one example, multiple candidate positions are displayed in the user interface, and the user selects one or more of them as the destination of the character region. In another example, the user clicks or slides, and the click position or the end point of the sliding operation is taken as the destination. For consecutive frames of live video pictures, the destinations corresponding to individual frames may be the same or different; when they differ, the adjusted pictures present a visual effect of the character jumping.
Moving out of the live video picture means removing the character region from the live video picture. Illustratively, as shown in fig. 9, the left side shows an original live video picture 90 containing 3 character regions; the viewer user clicks a removal control 91 to remove the rightmost one, and the two remaining character regions form the adjusted live video picture 92. It should be noted that when the live video picture contains multiple character regions, the user may remove all of them with a single operation (for example, clicking one control or performing one sliding operation), or remove a selected one in the same way.
Replacement with other display elements means replacing the character region with another display element such as an animal, a virtual character or a virtual object; the replacement element may be chosen by the viewer user or determined by the viewer client. Illustratively, as shown in fig. 10, the left side shows an original live video picture 100, and the adjusted live video picture 102 on the right is obtained by replacing the character region 101 with a rabbit 103. For consecutive frames of live video pictures, the replacement element of each frame may be the same or different; when it differs from frame to frame, the adjusted pictures present a visual effect of the display elements switching dynamically.
Copying duplicates the character region; for example, 2 identical character regions are obtained by copying and displayed in the live video picture. Illustratively, as shown in fig. 11, the left side shows an original live video picture 110, and the adjusted live video picture 112 on the right is obtained by copying the character region 111. The number of copies may be set by the user or determined by the viewer client, which is not limited in the embodiments of the present application. In an exemplary embodiment, when the position of the character region is detected to change between two consecutive live video frames, the copying may automatically stop so that only the original character region is displayed, or the multiple copied character regions may be displayed overlapping one another, thereby avoiding an overly cluttered picture.
Mirroring generates another character region that is a mirror image of the original one. In the adjusted live video picture, the original character region and its mirror image may be displayed simultaneously, only the mirror image may be displayed without the original, or the two may be displayed alternately at a certain switching frequency, which is not limited in the embodiments of the present application.
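The mirroring case reduces to a single flip; the sketch below, for illustration only, also covers the alternating-switch variant mentioned above. The switching period of 30 frames and the names are illustrative assumptions.

```python
import cv2

def mirrored_person(person_region, frame_index, switch_every=30):
    """Alternate between the original character region and its mirror image.

    Switches every `switch_every` frames; returning only `mirrored` instead
    would implement the display-mirror-only variant.
    """
    mirrored = cv2.flip(person_region, 1)  # flip around the vertical axis
    return mirrored if (frame_index // switch_every) % 2 else person_region
```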
The embodiments of the present application thus provide multiple ways of adjusting the live video picture. In practical applications, the viewer client may offer the viewer user only one of them, or several for the user to choose from. When several ways are offered simultaneously, the viewer client can display an option for each, and the viewer user selects the desired adjustment through these options.
Step 240, displaying the adjusted live video picture.
After the adjusted live video picture is obtained, the viewer client displays it. The viewer client may apply the above adjustment to consecutive frames of live video pictures and display each adjusted frame. For example, after the viewer user performs the adjustment trigger operation, every frame of the live video stream is adjusted until a set duration is reached or an adjustment end operation from the viewer user is received, at which point the adjustment stops.
Because the position of the character region in the live video picture can change at any time, each frame in a run of consecutive live video pictures is adjusted frame by frame and the correspondingly adjusted frame is displayed, which keeps the adjusted display consistent and avoids visible continuity errors.
In summary, in the technical solution provided by the embodiments of the present application, after the character region in the live video picture is determined, the live video picture is adjusted based on the position of that region, so that viewer users can customize the live video picture while watching a live broadcast, improving the flexibility of live video picture display.
In addition, the embodiments of the present application provide multiple adjustment modes, such as changing the background, controlling the bullet screen, zooming the character region and removing the character region, which makes the scheme more versatile and improves the user's interactive experience while watching the live broadcast.
To implement the adjustment function for live video pictures, the character region in the live video picture must be identified. The traditional character recognition method feeds the live video picture directly into a character recognition model, which outputs the position information of the character region; this is time-consuming and cannot meet the requirement of quickly processing and displaying consecutive frames of live video pictures. The embodiments of the present application therefore provide an efficient and computationally inexpensive character identification method which, as shown in fig. 12, may include the following steps:
step 1210, in response to the adjustment triggering operation for the live video picture, acquiring a live video picture corresponding to the live video picture.
The live video picture can be obtained by capturing a live video picture, or by directly extracting an image frame corresponding to the live video picture from a live video stream.
Step 1220, a background template picture corresponding to the live video picture is obtained.
A live video picture typically includes background content and foreground content. The foreground content includes the character (such as the anchor user), and the background content includes the environment in which the character is located (such as the room where the anchor user is).
A background template picture is a picture that includes the background content but contains no character. Compared with its corresponding background template picture, a live video picture has the same or similar background content, but the characters present in the live video picture are absent from the background template picture. Illustratively, as shown in fig. 13, the upper left is a live video picture 131, which may be an image frame extracted from the live video stream, and the lower left is its corresponding background template picture 132.
In an exemplary embodiment, the background template picture is acquired as follows: when a live video picture meeting a condition is obtained, it is saved as the background template picture, the condition being that no character region exists in the picture. When no background template picture has been obtained yet, the viewer client can apply a traditional character recognition method to the live video picture to determine whether a character region exists in it; if not, the picture is saved as the background template picture. Taking the character being an anchor user as an example, when the anchor user has just started broadcasting, or has temporarily left the live broadcast room, the live video picture usually contains only background content without the anchor user, and such a picture can be saved as the background template picture.
If a character region does exist in the live video picture, the character region can be removed to obtain the background template picture; for example, the pixel values of the pixels in the character region are set to a fixed value, and the resulting picture is saved as the background template picture.
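For illustration only, a sketch of this fallback: the character region located by the traditional full-frame recognizer is blanked out with a fixed value. The fill value of 0 and the names are assumptions.

```python
def make_background_template(frame, person_mask, fill_value=0):
    """Blank out the character region to obtain a background template picture.

    frame:       BGR live video picture image (NumPy array).
    person_mask: array that is nonzero inside the character region.
    """
    template = frame.copy()
    template[person_mask.astype(bool)] = fill_value  # fixed value, per the text
    return template
```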
It should be noted that, since the background of the live scene may change during the broadcast, a new background template picture can be acquired periodically to replace the old one, which helps improve the robustness of character recognition.
Step 1230, a foreground picture is generated based on the live video picture and the background template picture.
After the live video picture and its corresponding background template picture are obtained, the foreground picture can be generated by differencing the two pictures.
In an exemplary embodiment, this step may include the following sub-steps:
1. Register the live video picture and the background template picture to obtain the corresponding pixels in the two pictures.
The purpose of registration is to align the same image content in the live video picture and the background template picture; each group of corresponding pixels consists of the pixels in the two pictures that correspond to the same image content.
The embodiments of the present application do not limit the image registration method used; for example, corner-based or feature-based registration methods may be adopted. Image registration removes noise caused by camera shake and improves the accuracy of the subsequent foreground extraction.
2. Compute the differences between corresponding pixels of the live video picture and the background template picture to obtain a difference picture.
After the two pictures are registered, for each group of corresponding pixels, the pixel value in the background template picture is subtracted from the pixel value in the live video picture, and the resulting difference is the value of that pixel in the difference picture.
3. Generate the foreground picture based on the difference picture.
After the difference picture is obtained, it is binarized to produce the foreground picture. For example, a threshold is preset; for each pixel of the difference picture, if its value is smaller than the threshold, it is set to a first value, and if its value is greater than the threshold, it is set to a second value. Optionally, the first value is 0 and the second value is 1, or the first value is 1 and the second value is 0.
In one example, the binary picture obtained by binarizing the difference picture is used directly as the foreground picture. In another example, after binarization, erosion and dilation are further applied to the binary picture to eliminate sticking and breakage, and the resulting picture is used as the foreground picture.
As shown in fig. 13, the foreground picture 133 is generated by differencing the live video picture 131 and its corresponding background template picture 132.
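For illustration only, the whole of step 1230 can be sketched with a few OpenCV calls. ECC-based translation estimation is used here as one concrete stand-in for the unspecified registration method, and absolute differencing as the pixel difference; the threshold, kernel size and all names are assumptions of this sketch.

```python
import cv2
import numpy as np

def extract_foreground(frame, template, thresh=30, kernel_size=5):
    """Generate a binary foreground picture from a frame and its background template.

    Follows sub-steps 1-3 above: registration, pixel-wise difference,
    binarization, then erosion and dilation.
    """
    g_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    g_tmpl = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)

    # 1. registration: estimate a translation compensating camera shake (ECC)
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-4)
    try:
        _, warp = cv2.findTransformECC(g_frame, g_tmpl, warp,
                                       cv2.MOTION_TRANSLATION, criteria)
        g_tmpl = cv2.warpAffine(g_tmpl, warp,
                                (g_frame.shape[1], g_frame.shape[0]),
                                flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
    except cv2.error:
        pass  # registration failed; fall back to the unaligned template

    # 2. difference picture over corresponding pixels
    diff = cv2.absdiff(g_frame, g_tmpl)

    # 3. binarization, then erosion and dilation to fix sticking and breakage
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    binary = cv2.erode(binary, kernel)
    binary = cv2.dilate(binary, kernel)
    return binary  # 255 in foreground (character) areas
```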
Step 1240, perform character recognition processing on the foreground picture to obtain the position information of the character region, the position information indicating the position of the character region in the live video picture.
Optionally, the character recognition processing is performed on the foreground picture through a character recognition model to obtain the position information of the character region, where the character recognition model is a machine learning model for recognizing character regions, for example a model obtained by training a neural network using machine learning techniques.
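The machine-learning character recognition model itself is not specified by the disclosure. As a lightweight contour-based stand-in (explicitly not the patented model), bounding boxes of character regions can also be read straight off the binary foreground picture; the minimum area and the names are illustrative assumptions.

```python
import cv2

def person_boxes_from_foreground(foreground, min_area=2000):
    """Derive candidate character-region bounding boxes from a binary foreground picture.

    foreground: uint8 binary image, 255 in foreground areas.
    min_area:   blobs smaller than this are treated as noise.
    """
    contours, _ = cv2.findContours(foreground, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```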
In summary, in the technical solution provided by the embodiments of the present application, character recognition processing is performed on the foreground picture rather than directly on the original live video picture as in the conventional method. Because background content irrelevant to character recognition has been removed from the foreground picture, the interference of the background content on image recognition is eliminated and the computation required by image recognition is reduced, which improves recognition efficiency and meets the requirement of quickly processing and displaying consecutive frames of live video pictures.
The following are apparatus embodiments of the present application, which can be used to perform the method embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present application.
Referring to fig. 14, a block diagram of a display apparatus for live video pictures according to an embodiment of the present application is shown. The apparatus 1400 has the functions of implementing the above method embodiments; the functions may be implemented by hardware, or by hardware executing corresponding software. The apparatus 1400 may be the viewer terminal described above, or may be disposed in the viewer terminal. The apparatus 1400 may include: a picture display module 1410, a character marking module 1420 and a picture adjusting module 1430.
The picture display module 1410 is configured to display a live video picture of a target live broadcast room, where the live video picture is a video image frame obtained by decoding a live video stream of the target live broadcast room.
A character marking module 1420, configured to mark the character region in the live video picture in response to an adjustment trigger operation on the live video picture.
And the picture adjusting module 1430 is configured to adjust the live video picture based on the position of the character region in the live video picture to obtain an adjusted live video picture.
The picture display module 1410 is further configured to display the adjusted live video picture.
In an exemplary embodiment, the screen adjustment module 1430 is configured to:
extracting the character area based on the position of the character area in the live video picture;
and synthesizing the character area into a target background picture to obtain the adjusted live video picture.
In an exemplary embodiment, the screen adjustment module 1430 is configured to:
determining the display position of bullet screen information based on the position of the character area in the live video picture; the display position of the bullet screen information is not overlapped with the position of the character area;
and adding the bullet screen information into the live video picture based on the display position of the bullet screen information to obtain the adjusted live video picture.
In an exemplary embodiment, the screen adjustment module 1430 is configured to:
adjusting the character area based on the position of the character area in the live video picture to obtain an adjusted character area;
and obtaining the adjusted live video picture based on the adjusted character area.
Optionally, the adjustment processing includes at least one of: zooming, stretching, position adjustment, moving the character region out of the live video picture, replacing the character region with other display elements, copying, and mirroring.
In an exemplary embodiment, as shown in fig. 15, the apparatus 1400 further includes: a picture image obtaining module 1440, a template picture obtaining module 1450, a foreground picture generating module 1460 and a character recognition module 1470.
The picture image obtaining module 1440 is configured to, in response to an adjustment trigger operation for the live video picture, obtain a live video picture image corresponding to the live video picture.
A template picture obtaining module 1450, configured to obtain a background template picture corresponding to the live video picture.
A foreground picture generating module 1460, configured to generate a foreground picture based on the live video picture and the background template picture.
A character recognition module 1470, configured to perform character recognition processing on the foreground picture to obtain position information of the character region, where the position information is used to indicate a position of the character region in the live video picture.
Optionally, the foreground picture generating module 1460 is configured to:
registering the live video picture and the background template picture to obtain corresponding pixels in the live video picture and the background template picture;
calculating the differences between corresponding pixels in the live video picture and the background template picture to obtain a difference picture;
and generating the foreground picture based on the difference picture.
Optionally, the foreground picture generating module 1460 is specifically configured to:
performing binarization processing on the difference picture to obtain a binary picture;
and performing erosion and dilation processing on the binary picture to obtain the foreground picture.
Optionally, the character recognition module 1470 is configured to perform character recognition processing on the foreground picture through a character recognition model to obtain the position information of the character region, where the character recognition model is a machine learning model for recognizing the character region.
In an exemplary embodiment, as shown in fig. 15, the apparatus 1400 further includes: a template picture saving module 1480.
A template picture saving module 1480, configured to save a live video picture meeting a condition as the background template picture when such a live video picture is acquired, the condition being that no character region exists in the live video picture.
In summary, in the technical solution provided by the embodiments of the present application, after the character region in the live video picture is determined, the live video picture is adjusted based on the position of that region, so that viewer users can customize the live video picture while watching a live broadcast, improving the flexibility of live video picture display.
Referring to fig. 16, a block diagram of a terminal 1600 according to an embodiment of the present application is shown. The terminal 1600 may be an electronic device such as a mobile phone, a tablet computer, a multimedia player, a smart TV or a PC. A client of the live video application installed in the terminal 1600 can be used to implement the above display method for live video pictures. Specifically:
Generally, the terminal 1600 includes a processor 1601 and a memory 1602.
The processor 1601 may include one or more processing cores, for example a 9-core processor or a 16-core processor. The processor 1601 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field Programmable Gate Array) and PLA (Programmable Logic Array). The processor 1601 may also include a main processor and a coprocessor: the main processor processes data in the awake state and is also called a CPU (Central Processing Unit); the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1601 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 1602 may include one or more computer-readable storage media, which may be non-transitory. The memory 1602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1602 is used to store a computer program configured to be executed by one or more processors to implement the above-described method of display of a live video picture.
In some embodiments, the terminal 1600 may optionally further include: a peripheral interface 1603 and at least one peripheral device. The processor 1601, the memory 1602 and the peripheral interface 1603 may be connected by buses or signal lines, and each peripheral device may be connected to the peripheral interface 1603 via a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of: a radio frequency circuit 1604, a touch display screen 1605, a camera assembly 1606, an audio circuit 1607, a positioning assembly 1608 and a power supply 1609.
Those skilled in the art will appreciate that the configuration shown in fig. 16 does not limit the terminal 1600; it may include more or fewer components than shown, combine some components, or use a different arrangement of components.
In an exemplary embodiment, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described method of displaying a live video picture.
Optionally, the computer-readable storage medium may include: a ROM (Read-Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), an optical disc, or the like. The random access memory may include a ReRAM (Resistive Random Access Memory) and a DRAM (Dynamic Random Access Memory).
In an exemplary embodiment, there is also provided a computer program product which, when run on a terminal, causes the terminal to execute the above-described method of displaying a live video picture.
It should be understood that "a plurality" herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the preceding and following associated objects.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method for displaying a live video frame, the method comprising:
displaying a live video picture of a target live broadcast room, wherein the live video picture is a video image frame obtained by decoding a live video stream of the target live broadcast room;
marking a character area in the live video picture in response to an adjustment triggering operation for the live video picture;
adjusting the live video picture based on the position of the character area in the live video picture to obtain an adjusted live video picture;
and displaying the adjusted live video picture.
2. The method of claim 1, wherein the adjusting the live video frame based on the position of the character region in the live video frame to obtain an adjusted live video frame comprises:
extracting the character area based on the position of the character area in the live video picture;
and synthesizing the character area into a target background picture to obtain the adjusted live video picture.
3. The method of claim 1, wherein the adjusting the live video frame based on the position of the character region in the live video frame to obtain an adjusted live video frame comprises:
determining the display position of bullet screen information based on the position of the character area in the live video picture; the display position of the bullet screen information is not overlapped with the position of the character area;
and adding the bullet screen information into the live video picture based on the display position of the bullet screen information to obtain the adjusted live video picture.
4. The method of claim 1, wherein the adjusting the live video frame based on the position of the character region in the live video frame to obtain an adjusted live video frame comprises:
adjusting the character area based on the position of the character area in the live video picture to obtain an adjusted character area;
and obtaining the adjusted live video picture based on the adjusted character area.
5. The method of claim 4, wherein the adjustment processing comprises at least one of: zooming, stretching, position adjustment, moving the character area out of the live video picture, replacing the character area with other display elements, copying, and mirroring.
6. The method according to any one of claims 1 to 5, further comprising:
responding to the adjustment triggering operation aiming at the live video picture, and acquiring a live video picture image corresponding to the live video picture;
acquiring a background template picture corresponding to the live video picture;
generating a foreground picture based on the live video picture and the background template picture;
and performing character recognition processing on the foreground picture to obtain position information of the character area, wherein the position information is used for indicating the position of the character area in the live video picture.
7. The method of claim 6, wherein generating a foreground picture based on the live video picture and the background template picture comprises:
registering the live video picture and the background template picture to obtain corresponding pixels in the live video picture and the background template picture;
calculating differences between corresponding pixels in the live video picture and the background template picture to obtain a difference picture;
and generating the foreground picture based on the difference picture.
8. The method of claim 7, wherein the generating the foreground picture based on the difference picture comprises:
performing binarization processing on the difference picture to obtain a binary picture;
and performing erosion and dilation processing on the binary picture to obtain the foreground picture.
9. The method of claim 6, wherein the performing character recognition processing on the foreground picture to obtain the position information of the character area comprises:
performing character recognition processing on the foreground picture through a character recognition model to obtain the position information of the character area;
wherein the character recognition model is a machine learning model for recognizing the character area.
10. The method of claim 6, further comprising:
under the condition that a live video picture meeting the condition is obtained, saving the live video picture meeting the condition as the background template picture;
the live video picture meeting the condition is the live video picture without the character area.
11. A display apparatus for live video frames, the apparatus comprising:
the system comprises a picture display module, a video processing module and a video processing module, wherein the picture display module is used for displaying a live video picture of a target live broadcast room, and the live video picture refers to a video image frame obtained by decoding a live video stream of the target live broadcast room;
the character marking module is used for responding to the adjustment triggering operation aiming at the live video picture and marking the character area in the live video picture;
the picture adjusting module is used for adjusting the live video picture based on the position of the character area in the live video picture to obtain an adjusted live video picture;
and the picture display module is also used for displaying the adjusted live video picture.
12. A terminal characterized in that it comprises a processor and a memory in which is stored a computer program that is loaded and executed by the processor to implement a method of displaying live video pictures according to any one of claims 1 to 10.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements a method of displaying a live video picture according to any one of claims 1 to 10.
CN202011612821.2A 2020-12-30 2020-12-30 Display method, device, terminal and storage medium of live video picture Pending CN112752116A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011612821.2A CN112752116A (en) 2020-12-30 2020-12-30 Display method, device, terminal and storage medium of live video picture


Publications (1)

Publication Number Publication Date
CN112752116A (en) 2021-05-04

Family ID: 75649806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011612821.2A Pending CN112752116A (en) 2020-12-30 2020-12-30 Display method, device, terminal and storage medium of live video picture

Country Status (1)

Country Link
CN (1) CN112752116A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109151489A (en) * 2018-08-14 2019-01-04 广州虎牙信息科技有限公司 live video image processing method, device, storage medium and computer equipment
CN109460705A (en) * 2018-09-26 2019-03-12 北京工业大学 Oil pipeline monitoring method based on machine vision
CN109862414A (en) * 2019-03-22 2019-06-07 武汉斗鱼鱼乐网络科技有限公司 A kind of masking-out barrage display methods, device and server
CN110290425A (en) * 2019-07-29 2019-09-27 腾讯科技(深圳)有限公司 A kind of method for processing video frequency, device and storage medium
CN110795595A (en) * 2019-09-10 2020-02-14 安徽南瑞继远电网技术有限公司 Video structured storage method, device, equipment and medium based on edge calculation

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113315987A (en) * 2021-05-27 2021-08-27 北京达佳互联信息技术有限公司 Video live broadcast method and video live broadcast device
WO2022247293A1 (en) * 2021-05-27 2022-12-01 北京达佳互联信息技术有限公司 Video livestreaming method and video livestreaming apparatus
CN113596561A (en) * 2021-07-29 2021-11-02 北京达佳互联信息技术有限公司 Video stream playing method and device, electronic equipment and computer readable storage medium
CN113794831A (en) * 2021-08-13 2021-12-14 维沃移动通信(杭州)有限公司 Video shooting method and device, electronic equipment and medium
CN113794831B (en) * 2021-08-13 2023-08-25 维沃移动通信(杭州)有限公司 Video shooting method, device, electronic equipment and medium
CN114501051A (en) * 2022-01-24 2022-05-13 广州繁星互娱信息科技有限公司 Method and device for displaying mark of live object, storage medium and electronic equipment
CN114501051B (en) * 2022-01-24 2024-02-02 广州繁星互娱信息科技有限公司 Method and device for displaying marks of live objects, storage medium and electronic equipment
CN114449303A (en) * 2022-01-26 2022-05-06 广州繁星互娱信息科技有限公司 Live broadcast picture generation method and device, storage medium and electronic device
CN114727043A (en) * 2022-03-07 2022-07-08 国网山东省电力公司信息通信公司 Control method and system for automatic meeting place lens switching
CN115348468A (en) * 2022-07-22 2022-11-15 网易(杭州)网络有限公司 Live broadcast interaction method and system, audience live broadcast client and anchor live broadcast client
CN116600150A (en) * 2023-05-29 2023-08-15 佛山市炫新智能科技有限公司 Matrix type live broadcast display system and display method thereof
CN116600150B (en) * 2023-05-29 2024-02-06 佛山市炫新智能科技有限公司 Matrix type live broadcast display system and display method thereof

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210504