CN107770580B

CN107770580B - Video image processing method and device and terminal equipment

Info

Publication number: CN107770580B
Application number: CN201610694681.5A
Authority: CN
Inventors: 栾青
Original assignee: Beijing Sensetime Technology Development Co Ltd
Current assignee: Beijing Sensetime Technology Development Co Ltd
Priority date: 2016-08-19
Filing date: 2016-08-19
Publication date: 2020-11-17
Anticipated expiration: 2036-08-19
Also published as: CN107770580A

Abstract

An embodiment of the invention discloses a video image processing method, a video image processing device, and terminal equipment. The method comprises: drawing an information display object and at least one business object in a video by means of computer drawing; acquiring accumulated trigger operation information for the at least one business object in the video; and updating the information of the information display object according to the accumulated trigger operation information, so that the updated information corresponds to the accumulated trigger operation information of the at least one business object. Drawing the business object in the video effectively saves network resources; the information of the information display object reflects how the current business object is being displayed, and can guide an anchor or an advertiser to adjust the display of the business object accordingly.

Description

Video image processing method and device and terminal equipment

Technical Field

The embodiment of the invention relates to the technical field of data processing, in particular to a video image processing method, a video image processing device and terminal equipment.

Background

With the development of internet technology, advertisements are displayed in increasingly diverse ways, and the traditional billboard has given way to today's internet display formats. Service providers carry advertisements in platforms, webpages, and videos; for example, an advertisement is inserted during video playback so that the user sees the advertisement while watching the video, which improves the effect of the advertisement placement.

However, advertisements delivered over the existing internet are often costly and waste resources. For example, a 60-second advertisement video is inserted before a video starts playing, and a 20-second advertisement video is inserted during playback. This approach occupies excessive network resources during transmission and carries a high placement cost. In addition, no advertisement display statistics are collected while the advertisement is shown, so the display effect of the advertisement cannot be reflected intuitively, and interactivity is poor.

Disclosure of Invention

The embodiment of the invention provides a technical solution for video image processing.

According to an aspect of the embodiments of the present invention, there is provided a video image processing method, including: drawing an information display object and at least one business object in a video by means of computer drawing; acquiring accumulated trigger operation information for the at least one business object in the video; and updating the information of the information display object according to the accumulated trigger operation information, so that the updated information of the information display object corresponds to the accumulated trigger operation information of the at least one business object.

According to another aspect of the embodiments of the present invention, there is provided a video image processing apparatus, including: a drawing module, configured to draw the information display object and the at least one business object in the video by means of computer drawing; an acquisition module, configured to acquire accumulated trigger operation information for the at least one business object in the video; and an updating module, configured to update the information of the information display object according to the accumulated trigger operation information, so that the updated information of the information display object corresponds to the accumulated trigger operation information of the at least one business object.

According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium storing: executable instructions for drawing the information display object and the at least one business object in the video by means of computer drawing; executable instructions for acquiring accumulated trigger operation information for the at least one business object in the video; and executable instructions for updating the information of the information display object according to the accumulated trigger operation information, so that the updated information of the information display object corresponds to the accumulated trigger operation information of the at least one business object.

According to another aspect of the embodiments of the present invention, there is also provided a terminal device, including: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus; the memory stores at least one executable instruction, and the executable instruction causes the processor to execute any one of the video image processing methods described above.

In the embodiments of the invention, a business object and an information display object are each drawn in a video by means of computer drawing; accumulated trigger operation information for the business object in the video is acquired; and the information of the information display object is updated according to that accumulated information, so that the updated information corresponds to the accumulated trigger operation information of the business object. Setting an information display object makes it possible to reflect clearly how the business object is being displayed in the current live video: trigger operations on the business object are counted cumulatively to obtain the corresponding accumulated information, and the information of the information display object is then determined from it. The information of the information display object thus reflects the display status of the current business object and can guide the anchor or the advertiser to adjust the display of the business object accordingly.

Drawings

Fig. 1 is a flowchart illustrating the steps of a video image processing method according to a first embodiment of the present invention;

Fig. 2 is a flowchart illustrating the steps of a video image processing method according to a second embodiment of the present invention;

Fig. 3 is a block diagram of a video image processing apparatus according to a third embodiment of the present invention;

Fig. 4 is a block diagram of a video image processing apparatus according to a fourth embodiment of the present invention;

Fig. 5 is a block diagram of a terminal device according to a fifth embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the invention is provided in conjunction with the accompanying drawings (like numerals indicate like elements throughout the several views) and examples. The following examples are intended to illustrate the examples of the present invention, but are not intended to limit the scope of the examples of the present invention.

It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present invention are used merely to distinguish one element, step, device, module, or the like from another element, and do not denote any particular technical or logical order therebetween.

Example one

Referring to Fig. 1, a flowchart of the steps of a video image processing method according to the first embodiment of the present invention is shown.

This embodiment describes the video image processing method by taking, as an example, video image processing performed by a background server of a live broadcast platform during a live broadcast. The method is not limited to this scenario; it may also be applied to video image processing on a website, to video image processing for display during video playback, and so on, which the present invention does not specifically limit.

The video image processing method of the embodiment may specifically include the following steps:

Step 102: draw the information display object and at least one business object in the video by means of computer drawing.

When the anchor broadcasts live through a live broadcast application or live broadcast platform (for example, live broadcast applications such as bullfight, pepper, and YY), the broadcast may be accompanied by the display of business objects and of a corresponding information display object. Business objects include, but are not limited to, special effects containing semantic information, such as special effects containing advertising information in at least one of the following forms: two-dimensional sticker effects, three-dimensional effects, and particle effects. Examples are advertisements displayed in sticker form (that is, advertising stickers) and special effects used to present advertisements, such as 3D advertising effects. Other forms of business object are equally applicable to the video image processing scheme provided by the embodiments of the present invention, such as a text description or introduction of an APP or other application, or some form of object (for example, an electronic pet) that interacts with the video audience. The information display object indicates information about the trigger operations performed on business objects in the live video, such as viewing information or click information for the current business object, and may take forms including, but not limited to, show coins.

According to the time relationship between the video data frames and the special effect sequence frames, the at least one business object and the information display object are each drawn at their corresponding positions in the live video by means of computer drawing. The positions of the business object and the information display object may be preset or may be determined by a neural network.
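The source does not specify how video data frames and special effect sequence frames are aligned in time. As a minimal sketch, assuming millisecond timestamps, a fixed effect frame rate, and a looping effect sequence (all of which, like the function and parameter names, are hypothetical), the effect frame to draw on a given video frame could be selected as follows:

```python
from typing import Optional


def effect_frame_index(video_ts_ms: int, effect_start_ms: int,
                       effect_fps: int, effect_frame_count: int) -> Optional[int]:
    """Return which special-effect sequence frame to draw on the video frame
    stamped video_ts_ms, or None if the effect has not started yet.

    All names and the looping behaviour are illustrative assumptions.
    """
    if video_ts_ms < effect_start_ms:
        return None
    elapsed_ms = video_ts_ms - effect_start_ms
    # Integer arithmetic avoids floating-point rounding at frame boundaries.
    frames_elapsed = elapsed_ms * effect_fps // 1000
    return frames_elapsed % effect_frame_count  # loop the effect sequence
```

Integer timestamps are used here only to keep the frame selection deterministic; a real renderer would take its timing from the player or encoder clock.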

Step 104: acquire accumulated trigger operation information for the at least one business object in the video.

The platform in the embodiments of the invention may include, but is not limited to, a live broadcast platform, on which an anchor broadcasts live video in the anchor's live broadcast room. Once a business object starts to be displayed in the live program, a trigger operation is any operation by a fan or the anchor that triggers the business object. For example, accessing the current live broadcast room counts as accessing the currently displayed business object; other examples are a fan clicking the business object, or the anchor triggering the business object through a behavioral action. These three kinds of data may be acquired in various ways, for example by acquiring, through a system interface, the number of visitors to the live broadcast room on the current live broadcast platform (up to the time the business object is displayed), and by acquiring, through an SDK (Software Development Kit), the cumulative number of clicks on the business object during its display period. The present invention does not specifically limit how the number of visitors to the live video, the number of trigger actions by the anchor, or the cumulative number of clicks on the business object is obtained.

In an optional embodiment of the present invention, the accumulated trigger operation information may be counted for the current business object, or for all business objects since the live broadcast began; it may also be counted for a certain type of business object during the live broadcast, such as all Nike advertisements in the broadcast.
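The three statistical scopes described above (a single business object, all business objects, or one type of business object) can be illustrated with a small in-memory log. This is only a sketch: the `TriggerLog` class, the event labels `visit`, `click`, and `anchor_action`, and the identifiers used are assumptions for illustration, not part of the described method.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TriggerLog:
    """Hypothetical in-memory log of trigger operations in one live broadcast."""
    events: list = field(default_factory=list)  # (object_id, category, kind)

    def record(self, object_id: str, category: str, kind: str) -> None:
        # kind is an assumed label: "visit", "click", or "anchor_action".
        self.events.append((object_id, category, kind))

    def accumulate(self, object_id=None, category=None) -> Counter:
        """Count trigger operations per kind.

        With no arguments, all business objects are counted; object_id
        restricts the count to one object, category to one type of object.
        """
        counts = Counter()
        for oid, cat, kind in self.events:
            if object_id is not None and oid != object_id:
                continue
            if category is not None and cat != category:
                continue
            counts[kind] += 1
        return counts
```

Calling `accumulate()` with no arguments corresponds to counting all business objects since the broadcast began, while `accumulate(category="nike")` would correspond to counting one type of business object.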

Step 106: update the information of the information display object according to the accumulated trigger operation information, so that the updated information of the information display object corresponds to the accumulated trigger operation information of the at least one business object.

The acquired accumulated trigger operation information of the business object is analyzed and, based on a set conversion rule, converted into the corresponding information of the information display object, which reflects the operations performed on the business object.

It should be noted that the information of the information display object is determined according to the statistical scope used in step 104. If step 104 counts the accumulated trigger operation information of the current business object, the information display object displays information reflecting the current business object; if step 104 counts the accumulated trigger operation information of all business objects enabled by the anchor since the live broadcast began, the information display object displays information reflecting all of those business objects; and if step 104 counts the accumulated trigger operation information of a certain type of business object during the live broadcast, the information display object displays information reflecting that type of business object.

The information of the information display object is then updated to the information so determined. After the update, the updated information is consistent with the current accumulated trigger operation information of the business object; that is, the current accumulated trigger operation information can be converted, according to the set conversion rule, into the new information of the information display object.

In the embodiment of the present invention, the information of the information display object may include, but is not limited to, the counted revenue of the business object, expressed for example in show coins (or advertisement coins, etc.); the show coin (or advertisement coin) information reflects the revenue of the business object during its display.
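The source leaves the conversion rule open. One simple possibility, shown here purely as an assumed example, is a weighted sum in which each kind of trigger operation is worth a fixed number of show coins. The weights in `ASSUMED_RULE` and the function name are hypothetical.

```python
# Assumed weights: how many show coins one trigger operation of each kind is worth.
ASSUMED_RULE = {"visit": 1, "click": 5, "anchor_action": 2}


def to_show_coins(counts: dict, rule: dict) -> int:
    """Convert accumulated trigger counts into a show-coin total.

    Kinds that the rule does not mention contribute nothing.
    """
    return sum(rule.get(kind, 0) * n for kind, n in counts.items())


# Example: 10 visits and 3 clicks accumulated so far.
coins = to_show_coins({"visit": 10, "click": 3}, ASSUMED_RULE)
```

Under these assumed weights the example yields 10 × 1 + 3 × 5 = 25 show coins, which would then be written into the information display object.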

In this embodiment, a business object and an information display object are each drawn in the video by means of computer drawing; accumulated trigger operation information for the business object in the video is acquired; and the information of the information display object is updated according to that accumulated information, so that the updated information corresponds to the accumulated trigger operation information of the business object. Drawing the business object in the video effectively saves network resources. Setting the information display object makes it possible to reflect clearly how the business object is being displayed in the current live video: trigger operations on the business object are counted cumulatively to obtain the corresponding accumulated information, and the information of the information display object is then determined from it. The information of the information display object thus reflects the display status of the current business object and can guide the anchor or the advertiser to adjust the display of the business object accordingly.

Example two

Referring to Fig. 2, a flowchart of the steps of a video image processing method according to the second embodiment of the present invention is shown.

The embodiment of the invention takes the video image processing during live broadcasting of a background server in a live broadcasting platform as an example to introduce and explain the video image processing method provided by the embodiment of the invention.

The video image processing provided by the embodiment of the invention specifically comprises the following steps:

Step 202: determine the drawing position information of the information display object and the at least one business object in the video.

In the embodiment of the invention, when the anchor starts a live video in the anchor's live broadcast room through the live broadcast platform or live broadcast application, the drawing position information of the at least one business object and of the information display object is determined, according to a set rule, in the video image corresponding to the live video.

It should be noted that, in the embodiment of the present invention, the at least one business object is displayed in the live video in response to a click operation by the anchor. Statistics may be collected for one specific business object, for example only the accumulated trigger operation information of a Nike advertisement during the live broadcast, or for all business objects enabled by the anchor during the live video. This can be set according to actual needs and is not specifically limited in the embodiment of the present invention.

For consistency of description, the following takes as an example all the business objects enabled by the anchor during the live video; the processing for a specific business object can be carried out with reference to this embodiment.

In the embodiment of the present invention, the drawing position information of the business object and the information display object may be determined in at least the following two modes. Mode one: determine feature points of a target object from the video and, according to those feature points, determine the drawing position information of the business object to be drawn in the video image by using a pre-trained convolutional network model for determining the drawing position of a business object in a video image. Mode two: determine the type of a target object in the video according to the feature points of the target object; determine the drawing position information of the business object to be drawn according to the type of the target object; and determine the drawing position of the business object to be drawn in the video image according to the drawing position information.

The two modes are described in detail below.

Mode one

When mode one is used to determine the drawing position of the business object to be drawn in the video image, a convolutional network model must be trained in advance so that the trained model has the function of determining the drawing position of a business object in a video image; alternatively, a convolutional network model that has been trained by a third party and has this function may be used directly.

It should be noted that this embodiment focuses on the training for the business object; the training for the target object part may be implemented with reference to the related art and is only briefly described in the embodiment of the present invention.

When the convolutional network model needs to be trained in advance, one possible training method includes the following processes:

(1) Acquire the feature vector of a business object sample image to be trained.

The feature vector includes information of the target object in the business object sample image, and position information and/or confidence information of the business object. The information of the target object indicates image information of the target object. The position information of the business object indicates the position of the business object, and may be the position of the center point of the business object or the position of the area where the business object is located. The confidence information of the business object indicates the probability that the business object achieves a given effect (such as being noticed, clicked, or watched) when displayed at the current position; this probability may be set according to statistical analysis of historical data, the results of simulation experiments, or manual experience. In practical applications, while the target object is trained, only the position information of the business object, only its confidence information, or both may be trained according to actual needs. Training both enables the trained convolutional network model to determine the position information and confidence information of the business object more effectively and accurately, providing a basis for displaying the business object.

The convolutional network model is trained on a large number of sample images. In the embodiment of the invention, the business objects in the business object sample images may be labeled in advance with position information, confidence information, or both. Of course, in practical applications, this information may also be obtained in other ways. Labeling the business objects in advance effectively reduces the amount of data to be processed and the number of interactions, and improves data processing efficiency.

The business object sample images carrying the target object information and the position information and/or confidence information of the business object are used as training samples, and feature vectors are extracted from them, yielding feature vectors that contain the target object information and the position information and/or confidence information of the business object.

The feature vector may be extracted in an appropriate manner in the related art, and the embodiment of the present invention is not described herein again.

(2) Perform convolution processing on the feature vector to obtain a feature vector convolution result.

The resulting feature vector convolution result contains the information of the target object and the position information and/or confidence information of the business object.

The number of convolution operations applied to the feature vector can be set according to actual needs; that is, the number of convolutional layers in the convolutional network model is set according to actual needs, such that the final feature vector convolution result meets the criterion that the error is within a certain range (for example, 1/20 to 1/5 of the length or width of the image, and preferably 1/10 of the length or width of the image).
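The error criterion above can be read in more than one way. The sketch below assumes a per-axis interpretation, in which the horizontal error is compared against a fraction of the image width and the vertical error against the same fraction of the image height. The function names and this interpretation are assumptions rather than something the source states.

```python
def position_tolerance(width: int, height: int, fraction: float = 0.1) -> tuple:
    """Return the allowed (x, y) error for an image of the given size.

    fraction must lie in the range the text mentions, 1/20 to 1/5;
    the default 1/10 is the value the text calls preferable.
    """
    if not 1 / 20 <= fraction <= 1 / 5:
        raise ValueError("fraction must be between 1/20 and 1/5")
    return (width * fraction, height * fraction)


def within_tolerance(predicted, target, width, height, fraction=0.1) -> bool:
    """Check each axis of the predicted position against its tolerance."""
    tol_x, tol_y = position_tolerance(width, height, fraction)
    return (abs(predicted[0] - target[0]) <= tol_x
            and abs(predicted[1] - target[1]) <= tol_y)
```

For a 640 × 480 image and the default fraction, the tolerance is 64 pixels horizontally and 48 pixels vertically.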

The convolution result is the result of extracting the features of the feature vector, and the result can effectively represent the features and classification of each related object in the video image.

In the embodiment of the invention, when the feature vector contains both the position information and the confidence information of the business object, that is, when both are trained, the feature vector convolution result is shared by the separate convergence judgments that follow. No repeated processing or calculation is needed, which reduces the resource cost of data processing and improves data processing speed and efficiency.

(3) Judge separately whether the information of the corresponding target object in the feature vector convolution result, and the position information and/or confidence information of the business object, meet the convergence condition.

Wherein, the convergence condition is set by those skilled in the art according to the actual requirement. When the information meets the convergence condition, the parameter setting in the convolution network model can be considered to be appropriate; when the information cannot satisfy the convergence condition, the parameter setting in the convolutional network model is considered to be improper, and the parameter setting needs to be adjusted, wherein the adjustment is an iterative process until the result of performing convolution processing on the feature vector by using the adjusted parameter satisfies the convergence condition.

In a feasible manner, the convergence condition may be set according to a preset standard position and/or a preset standard confidence, for example, whether a distance between a position indicated by the position information of the service object in the feature vector convolution result and the preset standard position satisfies a certain threshold is taken as the convergence condition of the position information of the service object; and whether the difference between the confidence coefficient indicated by the confidence coefficient information of the business object in the feature vector convolution result and the preset standard confidence coefficient meets a certain threshold value is used as a convergence condition of the confidence coefficient information of the business object, and the like.

Preferably, the preset standard position may be the average position obtained by averaging the positions of the business objects in the business object sample images to be trained, and the preset standard confidence may be the average confidence obtained by averaging the confidences of the business objects in those sample images. The standard position and/or standard confidence are set from the positions and/or confidences of the business objects in the sample images to be trained; because these sample images are the training samples themselves and are very large in number, the resulting standard position and standard confidence are more objective and accurate.
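The averaging described above can be sketched in a few lines. The function names are hypothetical, and positions are assumed to be (x, y) pairs in pixels.

```python
from statistics import fmean


def standard_position(positions):
    """Average the labeled business object positions, axis by axis."""
    xs, ys = zip(*positions)
    return (fmean(xs), fmean(ys))


def standard_confidence(confidences):
    """Average the labeled business object confidences."""
    return fmean(confidences)
```

For instance, two sample positions (0, 0) and (10, 20) give a standard position of (5.0, 10.0).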

When specifically determining whether the position information and/or the confidence information of the corresponding service object in the feature vector convolution result satisfies the convergence condition, a feasible method includes:

acquiring position information of a corresponding service object in the feature vector convolution result; calculating a first distance between a position indicated by the position information of the corresponding service object and a preset standard position by using a first loss function; judging whether the position information of the corresponding service object meets a convergence condition or not according to the first distance;

and/or

obtaining confidence information of a corresponding business object in the feature vector convolution result; calculating a second distance between the confidence coefficient indicated by the confidence coefficient information of the corresponding business object and the preset standard confidence coefficient by using a second loss function; and judging whether the confidence information of the corresponding business object meets the convergence condition or not according to the second distance.

In an optional implementation, the first loss function may be a function that calculates the Euclidean distance between the position indicated by the position information of the corresponding business object and the preset standard position; and/or the second loss function may be a function that calculates the Euclidean distance between the confidence indicated by the confidence information of the corresponding business object and the preset standard confidence. Using the Euclidean distance is simple to implement and effectively indicates whether the convergence condition is met. The approach is not limited to this, however; other measures, such as the Mahalanobis distance or the Bhattacharyya distance, are equally applicable.
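A minimal sketch of the two Euclidean-distance loss functions and the threshold-based convergence check follows. The function names and the threshold parameters are assumptions; for a scalar confidence, the Euclidean distance reduces to an absolute difference.

```python
import math


def first_loss(predicted_pos, standard_pos) -> float:
    """Euclidean distance between predicted and standard positions."""
    return math.dist(predicted_pos, standard_pos)


def second_loss(predicted_conf: float, standard_conf: float) -> float:
    """Euclidean distance between two scalar confidences."""
    return abs(predicted_conf - standard_conf)


def converged(predicted_pos, predicted_conf, standard_pos, standard_conf,
              pos_threshold: float, conf_threshold: float) -> bool:
    """Both distances must fall within their (assumed) thresholds."""
    return (first_loss(predicted_pos, standard_pos) <= pos_threshold
            and second_loss(predicted_conf, standard_conf) <= conf_threshold)
```

When only position or only confidence is trained, the corresponding half of the check would be used on its own.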

Preferably, as mentioned above, the preset standard position is an average position obtained after averaging the positions of the business objects in the business object sample image to be trained; and/or the preset standard confidence coefficient is an average confidence coefficient obtained after the average processing is carried out on the confidence coefficient of the business object in the sample image of the business object to be trained.

For the information of the target object in the feature vector convolution result, whether it has converged can be judged with reference to the convergence conditions used for convolutional network models in the related art, which is not described here again. If the information of the target object meets the convergence condition, the target object can be classified and its category determined, providing a reference and basis for the subsequent determination of the drawing position of the business object.

(4) If the convergence condition is met, finishing the training of the convolution network model; if the convergence condition is not met, adjusting the parameters of the convolution network model according to the feature vector convolution result, and performing iterative training on the convolution network model according to the adjusted parameters of the convolution network model until the feature vector convolution result after the iterative training meets the convergence condition.
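The adjust-and-retry structure of step (4) can be illustrated with a deliberately simplified loop. This is not a convolutional network: the "model" here is just a pair of parameters that directly represents the predicted position, and the update rule (moving a fixed fraction of the way toward the standard position) is a stand-in chosen for brevity. All names are hypothetical.

```python
import math


def train_until_converged(params, standard_pos, threshold,
                          learning_rate=0.5, max_iters=1000):
    """Toy stand-in for iterative training: the 'model' predicts params directly.

    Returns (final_params, iterations_used); raises if convergence is not reached.
    """
    x, y = params
    for iteration in range(max_iters):
        # Convergence check: Euclidean distance to the standard position.
        if math.dist((x, y), standard_pos) <= threshold:
            return (x, y), iteration
        # Parameter adjustment: move a fraction of the way toward the target.
        x += learning_rate * (standard_pos[0] - x)
        y += learning_rate * (standard_pos[1] - y)
    raise RuntimeError("convergence condition not met within max_iters")
```

In an actual convolutional network model the adjustment step would typically be a gradient-based update of the network weights, but the loop shape (check, adjust, repeat) is the same.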

Through the above training, the convolutional network model can perform feature extraction and classification on the drawing positions of business objects displayed relative to the target object, and thereby acquires the function of determining the drawing position of a business object in a video image. When there are a plurality of candidate drawing positions, the training on business object confidence also allows the convolutional network model to rank those positions by display effect and so determine the optimal drawing position. In subsequent applications, when a business object needs to be displayed, an effective drawing position can be determined from the current image in the video.
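Once each candidate drawing position has a confidence attached, choosing the optimal one is a simple ranking. The sketch below assumes the model output has already been reduced to a list of ((x, y), confidence) pairs; that data layout and the function name are hypothetical.

```python
def best_drawing_position(candidates):
    """Rank candidate positions by confidence, highest first.

    candidates: list of ((x, y), confidence) pairs.
    Returns (best_position, ranked_candidates).
    """
    if not candidates:
        raise ValueError("at least one candidate position is required")
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return ranked[0][0], ranked
```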

In addition, before the convolutional network model is trained, the business object sample images may be preprocessed in advance, including: acquiring a plurality of business object sample images, wherein each business object sample image contains the labeling information of a business object; determining the position of the business object according to the labeling information, and judging whether the distance between the determined position of the business object and the preset position is smaller than or equal to a set threshold value; and determining the business object sample images in which that distance is smaller than or equal to the set threshold value as the business object sample images to be trained. The preset position and the set threshold may be set by a person skilled in the art in any appropriate manner, for example, according to a data statistical analysis result, a related distance calculation formula, or manual experience, which is not limited in this embodiment of the present invention.

In one possible approach, the position of the business object determined according to the annotation information may be a central position of the business object. When the position of the business object is determined according to the labeling information and whether the distance between the determined position of the business object and the preset position is smaller than or equal to a set threshold value is judged, the central position of the business object can be determined according to the labeling information; and then judging whether the variance between the center position and the preset position is less than or equal to a set threshold value.

By preprocessing the sample images of the business objects in advance, sample images which do not meet the conditions can be filtered out, so that the accuracy of the training result is ensured.
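The preprocessing filter can be sketched as follows. This is only an illustration under stated assumptions: the labeling information is taken to be a bounding box, the distance measure is taken to be the Euclidean distance between the box center and the preset position, and the names `SampleImage`, `center` and `select_training_samples` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SampleImage:
    name: str
    box: tuple  # assumed annotation format: (left, top, right, bottom)


def center(box):
    """Central position of the annotated business object."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def select_training_samples(samples, preset_position, threshold):
    """Keep samples whose object center lies within `threshold` of the preset position."""
    kept = []
    for sample in samples:
        if math.dist(center(sample.box), preset_position) <= threshold:
            kept.append(sample)
    return kept


candidates = [
    SampleImage("a.png", (90, 40, 110, 60)),     # center (100, 50)
    SampleImage("b.png", (300, 300, 320, 320)),  # center (310, 310)
]
to_train = select_training_samples(candidates, preset_position=(100, 50), threshold=20)
```

A different measure (for example, a variance-style squared distance) could be substituted inside `select_training_samples` without changing the overall filtering logic.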

The training of the convolutional network model is realized through the process, and the trained convolutional network model can be used for determining the drawing position of the business object in the video image. For example, in the process of live video, if the anchor clicks the service object indication to display the service object, after the convolutional network model obtains the facial feature point of the anchor in the live video image, the optimal position for displaying the service object, such as the forehead position of the anchor, can be indicated, and then the mobile terminal controls the live broadcast application to display the service object at the position; or, in the process of live video, if the anchor clicks the service object indication to display the service object, the convolution network model can directly determine the drawing position (drawing position information) of the service object according to the live video image.

Mode two

In the second mode, firstly, the type of the target object is determined according to the characteristic points of the target object; determining drawing position information of the business object to be drawn according to the type of the target object; and then determining the drawing position of the business object to be drawn in the video image according to the drawing position information.

Wherein the types of the target object include, but are not limited to: face type, background type, hand type, and action type. The face type is used for indicating that a face occupies a main part in the video image, the background type is used for indicating that a background occupies a larger part in the video image, the hand type is used for indicating that a hand occupies a main part in the video image, and the action type is used for indicating that a person performs a certain action.

After the feature points of the target object are obtained, the type of the target object can be determined by adopting an existing related detection, classification or learning method. After the type of the target object is determined, the drawing position information of the business object to be drawn can be determined according to a set rule, including:

when the type of the target object is a face type, determining drawing position information of the business object to be drawn includes at least one of the following: a hair region, a forehead region, a cheek region, a chin region, and a body region other than the head of the person in the video image; and/or,

when the type of the target object is a background type, determining drawing position information of the business object to be drawn includes: a background region in the video image; and/or,

when the type of the target object is a hand type, determining drawing position information of the business object to be drawn includes: the region within a set range centered on the region where the hand is located in the video image; and/or,

when the type of the target object is the action type, determining drawing position information of the business object to be drawn comprises: a predetermined area in the video image.

The preset area in the video image may include: any region other than the person in the video image may be set as appropriate by a person skilled in the art according to actual conditions, for example, a region within a set range centered on the motion generation portion, a region within a set range other than the motion generation portion, or a background region, and the like, which is not limited in the embodiment of the present invention.
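The set rule of mode two amounts to a lookup from target object type to candidate drawing regions. A minimal sketch follows; the enum `TargetType`, the table `DRAWING_REGIONS` and the region labels are hypothetical names chosen for illustration, and the content of the preset area for the action type is left abstract because the embodiment leaves it to the implementer.

```python
from enum import Enum


class TargetType(Enum):
    FACE = "face"
    BACKGROUND = "background"
    HAND = "hand"
    ACTION = "action"


# Candidate drawing regions per target type, following the rule listed above.
DRAWING_REGIONS = {
    TargetType.FACE: ["hair", "forehead", "cheek", "chin", "body_except_head"],
    TargetType.BACKGROUND: ["background"],
    TargetType.HAND: ["set_range_around_hand"],
    TargetType.ACTION: ["preset_area"],
}


def candidate_regions(target_type):
    """Return a copy of the candidate regions for the given target type."""
    return list(DRAWING_REGIONS[target_type])
```

For instance, `candidate_regions(TargetType.FACE)` yields the five face-related regions, from which one or more may be selected as the drawing position information.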

In an optional embodiment, the action corresponding to the action type includes at least one of: blinking, opening the mouth, nodding, shaking the head, kissing, smiling, waving a hand, making a scissor-hand gesture, making a fist, holding hand, giving a thumbs-up, making a hand-gun gesture, making a V-sign, and making an OK sign.

After the drawing position information is determined, the drawing position of the business object to be drawn in the video image can be further determined. For example, the center point of the drawing position information is taken as the center point of the drawing position of the business object to draw the business object; for another example, a certain coordinate position in the drawing position information is determined as a center point of the drawing position, and the like, which is not limited in this embodiment of the present invention. In the embodiment of the present invention, the preset area in the video image may include: a region of a person in the video image or any region other than the person in the video image. If the information display object is arranged at the positions of the upper left corner, the lower left corner, the upper right corner or the lower right corner of the live video interface, the information display object can be set to be movable, and the anchor can move the information display object to a proper position according to the needs of own live broadcasting. Therefore, in the embodiment of the present invention, the position information of the information display object is not particularly limited.

Step 204, respectively drawing the business object and the information display object in the video by adopting a computer drawing mode according to the drawing position information.

Before drawing, the special effect sequence frames corresponding to the business object and the information display object are determined. The business object and the information display object are then respectively drawn in the picture of the live video in a computer drawing mode, based on the determined position information and according to the time relation between the video data frames and the special effect sequence frames. Specifically, the information display object may be drawn in a computer drawing mode, which may be implemented by any suitable method of drawing or rendering a graphic image, including but not limited to drawing based on an OpenGL graphics drawing engine. OpenGL defines a cross-language, cross-platform programming interface specification for graphics; it is hardware-independent and can conveniently render 2D or 3D graphic images. With OpenGL, not only 2D effects such as 2D stickers can be drawn, but also 3D effects, particle effects, and the like.
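One way to picture the time relation between video data frames and special effect sequence frames is the sketch below. The embodiment does not specify this mapping, so everything here is an assumption: the effect is taken to start at a known video timestamp, to play at a fixed frame rate, and optionally to loop. The function name `effect_frame_index` and its parameters are hypothetical.

```python
def effect_frame_index(video_time_s, effect_start_s, effect_fps, frame_count, loop=True):
    """Pick which special effect sequence frame to draw on a given video frame.

    Returns None when no effect frame should be drawn at that time.
    """
    elapsed = video_time_s - effect_start_s
    if elapsed < 0:
        return None  # the effect has not started yet
    index = int(elapsed * effect_fps)
    if loop:
        return index % frame_count  # wrap around and replay the sequence
    return index if index < frame_count else None


# An 8-frame effect at 10 frames per second that starts 10 s into the video.
first = effect_frame_index(10.5, 10.0, 10, 8)
wrapped = effect_frame_index(11.0, 10.0, 10, 8)
```

The selected sequence frame would then be handed to the rendering layer (for example an OpenGL-based engine) together with the drawing position.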

Step 206, acquiring the cumulative number of clicks on the business object in the video within a period of time; and determining the accumulated information of the trigger operation of at least one business object in the video according to the accumulated times of clicking.

The platform of the embodiment of the invention may include, but is not limited to, a live broadcast platform. The anchor broadcasts live video in the anchor's live broadcast room, and when at least one business object starts to be displayed in the live program, the cumulative number of clicks on the business object within the display time period is obtained through the SDK.

In the embodiment of the invention, the business object has corresponding link information, and when a fan user clicks the business object, the user is redirected to the corresponding interface according to the link information. The link information corresponds to the website of the business object treasure house, the official website, and the like.

In an alternative scheme of the embodiment of the present invention, the SDK obtains the number of times of accessing the address corresponding to the link information through the platform, and determines the obtained number of times as the cumulative number of times of clicking the service object.
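The alternative scheme can be sketched as counting accesses to each business object's link address within the display time period. The access log format (timestamp in seconds, URL), the example URLs and the function name `cumulative_clicks` are assumptions made for illustration; the actual SDK interface is not described in the embodiment.

```python
def cumulative_clicks(access_log, link_by_object, start_s, end_s):
    """Count link accesses per business object within [start_s, end_s]."""
    object_by_link = {link: obj for obj, link in link_by_object.items()}
    clicks = {obj: 0 for obj in link_by_object}
    for timestamp_s, url in access_log:
        obj = object_by_link.get(url)
        if obj is not None and start_s <= timestamp_s <= end_s:
            clicks[obj] += 1
    return clicks


# Hypothetical access log reported by the platform: (timestamp in seconds, URL).
log = [
    (5, "https://example.com/ad-a"),
    (12, "https://example.com/ad-a"),
    (15, "https://example.com/unrelated"),
    (99, "https://example.com/ad-a"),  # outside the display time period
]
links = {"ad_a": "https://example.com/ad-a"}
clicks_in_period = cumulative_clicks(log, links, start_s=0, end_s=60)
```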

Wherein, the video comprises: a live video broadcast in a live broadcast platform. The business object specifically comprises: special effects containing advertising information in at least one of the following forms: two-dimensional sticker special effect, three-dimensional special effect and particle special effect. Examples are advertisements displayed in sticker form (i.e., advertising stickers), or advertisements displayed as special effects, such as 3D advertisement special effects. The business object is not limited thereto; other forms of business objects are also applicable to the business statistics scheme provided by the embodiments of the present invention, such as text descriptions or presentations of an APP or other application, or some form of object (e.g., an electronic pet) that interacts with the video audience. The obtained cumulative number of clicks on the business object is determined as the trigger operation accumulated information of the business object within the display time period.

In an alternative scheme in the embodiment of the present invention, the trigger operation accumulated information of the business object may include not only the cumulative number of clicks on the business object, but also the current number of visitors of the video or the trigger action accumulated information of the anchor. Reference may be made specifically to step 208 and step 210.

In the embodiment of the present invention, step 208 and step 210 are both optional steps.

Step 208, acquiring the current number of visiting persons of the video; and determining the accumulated information of the triggering operation of at least one business object in the video according to the current number of visiting people.

The platform of the embodiment of the invention may include, but is not limited to, a live broadcast platform. The anchor broadcasts live video in the anchor's live broadcast room, and when at least one business object starts to be displayed in the live program, the current number of persons accessing the video in the live broadcast room of the live broadcast platform is obtained through a system interface (until the service object is displayed).

And determining the number of the current access people of the acquired live broadcast room video as the triggering operation accumulated information of the business object in the display time period.

Step 210, acquiring the accumulated information of the trigger action of the anchor in the video through action detection; and determining the trigger operation accumulated information of at least one business object in the video according to the trigger action accumulated information.

In the embodiment of the present invention, the trigger action of the anchor and the displayed at least one service object may also be linked, and the display effect of the service object is increased by the trigger action of the anchor, that is, the trigger action of the anchor is also used as a part for determining the display effect of the service object.

During the display time period, the trigger actions of the anchor are monitored in real time, and the accumulated information of the anchor's trigger actions is counted. For example, if a business object is an advertisement sold in Mei Tuo, a trigger action of the anchor, such as the anchor jumping, triggers a certain reward while the advertisement is displayed, for example the random issuing of coupons; the more trigger actions the anchor accumulates, the more coupons are issued.

And determining the acquired trigger action accumulated information of the anchor as trigger operation accumulated information of the business object within the display time period.

It should be noted that, in the embodiment of the present invention, the steps 206, 208, and 210 are optional steps, the steps 206, 208, and 210 may be combined arbitrarily, and one, two, or three of the steps 206, 208, and 210 may be selected arbitrarily according to actual requirements of the solution to perform the solution.

Step 212, updating the information of the information display object according to the triggering operation accumulated information, so that the updated information of the information display object corresponds to the triggering operation accumulated information of the business object.

The server converts the trigger operation accumulated information into information of the information display object according to a preset conversion rule, wherein the trigger operation accumulated information comprises at least one of the following: the number of clicks on the business object, the trigger action accumulated information of the anchor, or the current number of visitors of the video.

The trigger operation accumulated information is converted into the corresponding information of the information display object according to a set conversion rule. The conversion rule may be agreed between the live broadcast platform operator and the advertiser, or set by the live broadcast platform operator alone. For example, during the display time of the business object, the number of persons entering the live broadcast room during that period (the number of visitors) is determined according to user IDs and taken as the number of persons watching the business object, and each time a new person enters the live broadcast room, the revenue of the business object increases by 1; likewise, the current number of clicks on the business object is determined, and each click adds 1 to the revenue of the business object. The number of visitors, the number of clicks, and so on are accumulated to determine the current information reflecting the revenue of the business object. In an alternative scheme of the embodiment of the present invention, the number of visitors currently accessing the live video, the number of clicks on the displayed business object, and the trigger action accumulated information of the anchor are respectively obtained in real time or at set time intervals (e.g., 2 s), and are converted according to the set rule into corresponding information reflecting the current display effect of the business object; this information is determined as the information of the information display object to be updated.
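The example rule above (one unit of revenue per visitor and per click) can be written as a small conversion function. This is a sketch, not the platform's actual rule: the function name `to_display_value`, the per-item weights and the weight for anchor trigger actions are assumptions, and any of the three inputs may be omitted because steps 206, 208 and 210 can be combined arbitrarily.

```python
def to_display_value(clicks=None, visitors=None, anchor_actions=None,
                     per_click=1, per_visitor=1, per_action=1):
    """Convert trigger operation accumulated information into one display value.

    Inputs left as None are treated as "not collected" and contribute nothing.
    """
    total = 0
    for count, weight in ((clicks, per_click),
                          (visitors, per_visitor),
                          (anchor_actions, per_action)):
        if count is not None:
            total += count * weight
    return total


# 900 visitors and 100 clicks under the one-unit-each rule.
value = to_display_value(clicks=100, visitors=900)
```

In a running system such a function would be called in real time or at the set interval, and its result written to the information display object.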

In the embodiment of the present invention, the information of the information display object may include, but is not limited to, revenue information of the business object, for example current information representing the revenue of the business object in the form of display coins. Display coins are similar to the virtual currency in a live broadcast application, but they have no consuming function and only reflect the revenue of the business object (advertisement). The so-called revenue is not a charge for the advertisement, but the display effect of the business object during its display.

In the embodiment of the present invention, the number of times of displaying each business object is at least one, and after each business object is displayed, the ID information of each business object and the corresponding display information (information of the information display object) are correspondingly stored, and the information is used as the historical data for adjusting the next display of the business object.
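Storing the display information against each business object ID can be sketched with an in-memory store. The class name `DisplayHistory` and the shape of a record (a plain dictionary) are assumptions for illustration; a real deployment would presumably persist this data on the server.

```python
from collections import defaultdict


class DisplayHistory:
    """Keeps, per business object ID, the display information of each past display."""

    def __init__(self):
        self._records = defaultdict(list)

    def record(self, object_id, display_info):
        """Store the information display object's value after one display ends."""
        self._records[object_id].append(display_info)

    def history(self, object_id):
        """Return a copy of all stored display information for the object."""
        return list(self._records.get(object_id, []))


store = DisplayHistory()
store.record("ad_a", {"display_coins": 1000, "visitors": 900})
store.record("ad_a", {"display_coins": 1500, "visitors": 1000})
```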

Step 214, adjusting the service object to be displayed in the live video according to the historical data of the information display object.

The anchor or the advertiser can obtain the historical data of the information of the information display object by retrieving the historical display data of the business object, and adjust the business object to be displayed in the live video accordingly, for example by triggering a modification instruction. The specific adjustment may include: adjusting the display duration of the business object to be displayed in the live video, adjusting the display position of the business object, adjusting the display time period of the business object to be displayed in the live video, or adjusting the type of the business object to be displayed in the live video.

For example, taking live video as an example, an information display object is set in the interface of the anchor end; specifically, the information display object may be displayed in the form of display coins. After advertisement A starts to be displayed, the number of visitors and the number of clicks on the current advertisement are obtained in real time and converted into the corresponding display coins, and the number of display coins is updated every 20 seconds. Suppose that after advertisement A has been displayed, the corresponding amount of display coins is 1000 and the number of visitors to the live broadcast room is 900; that is, only about 100 people clicked on advertisement A. For an advertisement with such a small click volume, the display time of the advertisement can be increased, for example by increasing the number of times the advertisement is displayed in the live video; of course, it may also be decided to replace the advertisement immediately. According to the information displayed by the information display object, the anchor or the advertiser can determine the adjustment strategy of the corresponding business object and adjust the delivery of the corresponding business object according to the adjustment strategy.
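The arithmetic of this example, and one possible way to turn it into an adjustment decision, can be sketched as follows. The sketch assumes the conversion rule of one display coin per visitor and one per click; the click-ratio threshold and the function names `clicks_from_coins` and `suggest_adjustment` are hypothetical, since the embodiment leaves the adjustment strategy to the anchor or advertiser.

```python
def clicks_from_coins(display_coins, visitors):
    """Recover the click count, assuming 1 coin per visitor and 1 coin per click."""
    return display_coins - visitors


def suggest_adjustment(display_coins, visitors, min_click_ratio=0.2):
    """Suggest increasing display time when few visitors clicked the advertisement."""
    clicks = clicks_from_coins(display_coins, visitors)
    ratio = clicks / visitors if visitors else 0.0
    if ratio < min_click_ratio:
        return "increase_display_time"
    return "keep_current_display"


# Advertisement A: 1000 display coins, 900 visitors.
clicks_a = clicks_from_coins(1000, 900)
advice_a = suggest_adjustment(1000, 900)
```

Here roughly one visitor in nine clicked, which falls below the assumed threshold, so the sketch suggests a longer or more frequent display.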

The embodiment of the invention determines the position information corresponding to the business object and the information display object in the video according to the set rule, and respectively draws the business object and the information display object in the video in a computer drawing mode according to the position information. The cumulative number of clicks on the business object in the video within a time period is acquired, and the trigger operation accumulated information of the business object in the video is determined at least according to the cumulative number of clicks; the trigger action accumulated information of the anchor in the video is acquired through action detection, and the trigger operation accumulated information of the business object is determined at least according to the trigger action accumulated information; the current number of visitors of the video is acquired, and the trigger operation accumulated information of the business object is determined at least according to the current number of visitors. The information of the information display object is then updated according to the trigger operation accumulated information, so that the updated information of the information display object corresponds to the trigger operation accumulated information of the business object. Because the business object is added to the live video picture as a special effect, the investment in advertisement cost is reduced. By setting the information display object, the display condition of the business object during the live broadcast can be clearly reflected: the trigger operations on the business object are accumulated and counted to obtain the corresponding accumulated information, and the information of the information display object is determined according to that accumulated information.
The information of the information display object can effectively reflect the current display condition of the business object. The method can also adjust the business object to be displayed in the live video according to the historical data of the information display object, determine the adjustment strategy of the business object, and effectively utilize the live video to display the business object.

Those skilled in the art will understand that, in the above method according to the embodiment of the present invention, the sequence number of each step does not mean the execution sequence, and the execution sequence of each step should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present invention.

Example three

Referring to fig. 3, a block diagram of a video image processing apparatus according to a third embodiment of the present invention is shown; the apparatus specifically comprises the following modules:

The drawing module 302 is configured to draw the information display object and the at least one service object in the video in a computer drawing manner.

The obtaining module 304 is configured to obtain accumulated information of trigger operations on at least one service object in the video.

The updating module 306 is configured to update the information of the information display object according to the trigger operation cumulative information, so that the updated information of the information display object corresponds to the trigger operation cumulative information of the at least one service object.
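The division of the apparatus into three modules can be pictured with the skeleton below. It is only a structural sketch: the class names mirror the module names, but the method names, the dictionary used to stand in for a video frame, and the click-only accumulation are assumptions, and no real rendering is performed.

```python
class DrawingModule:
    """Stands in for module 302; real drawing would use a graphics engine."""

    def draw(self, frame, overlays):
        # Record which objects were "drawn" on the frame instead of rendering.
        return {"frame": frame, "overlays": list(overlays)}


class ObtainingModule:
    """Stands in for module 304; accumulates trigger operations (clicks here)."""

    def __init__(self):
        self.clicks = 0

    def on_click(self):
        self.clicks += 1

    def accumulated(self):
        return self.clicks


class UpdatingModule:
    """Stands in for module 306; keeps the display object in step with the count."""

    def update(self, display_object, accumulated):
        display_object["value"] = accumulated
        return display_object


drawing, obtaining, updating = DrawingModule(), ObtainingModule(), UpdatingModule()
display_object = {"name": "display_coins", "value": 0}
drawn = drawing.draw("frame-0", ["ad_sticker", display_object["name"]])
obtaining.on_click()
obtaining.on_click()
updating.update(display_object, obtaining.accumulated())
```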

Example four

Referring to fig. 4, a block diagram of a video image processing apparatus according to a fourth embodiment of the present invention is shown, which may specifically include the following modules:

a drawing module 302, configured to determine position information corresponding to at least one service object and an information display object in a video; and respectively drawing the business object and the information display object in the video in a computer drawing mode according to the position information.

As an improvement, the drawing module is used for determining feature points of a target object from the video, and determining drawing position information of a business object to be drawn in the video image by using a pre-trained convolution network model for determining the drawing position of the business object in the video image according to the feature points of the target object; and/or determining the type of the target object from the video, and determining the drawing position information of the service object to be drawn according to the type of the target object.

When the type of the target object is a face type, determining drawing position information of the business object to be drawn includes at least one of the following: a hair region, a forehead region, a cheek region, a chin region, and a body region other than the head of the person in the video image; and/or when the type of the target object is a background type, determining the drawing position information of the service object to be drawn comprises: a background region in the video image; and/or when the type of the target object is a hand type, determining the drawing position information of the business object to be drawn comprises the following steps: the area in the set range, which takes the area where the hand is located in the video image as the center; and/or when the type of the target object is the action type, determining the drawing position information of the service object to be drawn comprises the following steps: a predetermined area in the video image. The drawing position of the information presentation object includes any region other than the person in the video image.

The obtaining module 304 includes: a first obtaining submodule 3042, configured to obtain the cumulative number of clicks on at least one service object in the video within a time period, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the cumulative number of clicks; a second obtaining submodule 3044, configured to obtain, through action detection, the trigger action accumulated information of the anchor in the video, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the trigger action accumulated information; and a third obtaining submodule 3046, configured to obtain the current number of visitors of the video, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the current number of visitors.

The adjusting module 308 is configured to adjust a service object to be displayed in the video according to the historical data of the information display object.

As an improvement, the adjusting module 308 is configured to adjust a display duration and/or a display position of the business object to be displayed in the video.

Wherein, the video comprises: a live video broadcast in a live broadcast platform.

Wherein, the business object comprises: special effects containing advertising information in at least one of the following forms: two-dimensional sticker special effect, three-dimensional special effect and particle special effect.

Example five

Fig. 5 is a schematic structural diagram of a terminal device according to a fifth embodiment of the present invention, where the specific embodiment of the present invention does not limit the specific implementation of the terminal device.

As shown in fig. 5, the terminal device 500 may include:

a processor 502, a communications interface 504, a memory 506, and a communication bus 508. Wherein:

the processor 502, communication interface 504, and memory 506 communicate with one another via a communication bus 508.

A communication interface 504 for communication between the server and the client.

The processor 502 is configured to execute the program 510, and may specifically perform the relevant steps in the above method embodiments.

In particular, program 510 may include program code comprising computer operating instructions.

The processor 502 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.

A memory 506 for storing the program 510. The memory 506 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory. The program 510 may specifically be used to cause the processor 502 to perform the following operations: respectively drawing an information display object and at least one business object in a video in a computer drawing mode; acquiring accumulated information of trigger operations on the business object in the video; and updating the information of the information display object according to the trigger operation accumulated information, so that the updated information of the information display object corresponds to the trigger operation accumulated information of the business object.

In an alternative embodiment, the program 510 is further configured to enable the processor 502 to obtain a cumulative number of clicks on at least one business object in the video within a period of time; and determining the accumulated information of the trigger operation of the business object in the video at least according to the accumulated times of clicking.

In an alternative embodiment, the program 510 is further configured to enable the processor 502 to obtain accumulated information of trigger actions of the anchor in the video through action detection; and determining the accumulated information of the trigger operation of at least one business object in the video at least according to the accumulated information of the trigger action.

In an alternative embodiment, the program 510 is further configured to cause the processor 502 to obtain a current number of visitors to the video; and determining the accumulated information of the triggering operation of at least one business object in the video at least according to the current number of visitors.

In an alternative embodiment, the program 510 is further configured to enable the processor 502 to adjust at least one service object to be displayed in the video according to the historical data of the information display object.

In an alternative embodiment, the program 510 is further configured to enable the processor 502 to adjust a display duration and/or a display position of the business object to be displayed in the video.

In an alternative embodiment, the program 510 is further configured to enable the processor 502 to determine feature points of a target object from the video, and determine, according to the feature points of the target object, drawing position information of a business object to be drawn in the video image by using a pre-trained convolution network model for determining a drawing position of the business object in the video image; or, determining the type of the target object from the video, and determining the drawing position information of the service object to be drawn according to the type of the target object.

In an alternative embodiment, program 510 is further configured to cause processor 502 to determine rendering position information of the business object to be rendered when the type of the target object is a face type, including at least one of: a hair region, a forehead region, a cheek region, a chin region, and a body region other than the head of the person in the video image; and/or when the type of the target object is a background type, determining the drawing position information of the service object to be drawn comprises: a background region in the video image; and/or when the type of the target object is a hand type, determining the drawing position information of the business object to be drawn comprises the following steps: the area in the set range, which takes the area where the hand is located in the video image as the center; and/or when the type of the target object is the action type, determining the drawing position information of the service object to be drawn comprises the following steps: a predetermined area in the video image.

In an alternative embodiment, the program 510 is further configured to cause the processor 502 to configure the drawing position of the information presentation object to include any region other than the person in the video image.

In an alternative embodiment, the video comprises a live video in a live broadcast platform.

In an alternative embodiment, the business object comprises a special effect containing advertisement information in at least one of the following forms: a two-dimensional sticker special effect, a three-dimensional special effect, and a particle special effect.

In the embodiments of the present invention, position information of the business object and of the information display object in the video is determined according to a set rule, and the business object and the information display object are each drawn in the video by computer drawing according to that position information. The accumulated number of clicks on the business object in the video within a time period is acquired, and the trigger operation accumulated information of the business object in the video is determined at least according to the accumulated number of clicks. Trigger action accumulated information of the anchor in the video is acquired through action detection, and the trigger operation accumulated information of the business object is determined at least according to the trigger action accumulated information. The current number of visitors of the video is acquired, and the trigger operation accumulated information of the business object is determined at least according to the current number of visitors. The information of the information display object is then updated according to the trigger operation accumulated information, so that the updated information of the information display object corresponds to the trigger operation accumulated information of the business object. Because the business object is added to the live video picture as a special effect, advertising cost is reduced. Setting the information display object makes the display status of the business object during the live broadcast clearly visible: trigger operations on the business object are accumulated and counted to obtain the corresponding accumulated information, and the information of the information display object is determined from that accumulated information, so that it effectively reflects the current display status of the business object. The business object to be displayed in the live video can also be adjusted according to the historical data of the information display object, which determines an adjustment strategy for the business object and makes effective use of the live video to display it.

It should be noted that, according to implementation requirements, each component/step described in the embodiments of the present invention may be divided into more components/steps, and two or more components/steps or partial operations of components/steps may also be combined into a new component/step to achieve the purpose of the embodiments of the present invention.
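
The accumulation and update steps summarized above could be sketched as follows. The embodiment only states that the trigger operation accumulated information is determined "at least according to" clicks, anchor trigger actions, or the current number of visitors; combining all three in a weighted sum, the weight values, and the names `TriggerStats`, `accumulated_trigger_info`, and `InfoDisplayObject` are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TriggerStats:
    """Raw counts gathered for one business object (hypothetical container)."""
    clicks: int = 0          # accumulated clicks within the time period
    anchor_actions: int = 0  # anchor trigger actions found by action detection
    visitors: int = 0        # current number of visitors of the video


def accumulated_trigger_info(stats: TriggerStats, w_click: float = 1.0,
                             w_action: float = 2.0,
                             w_visitor: float = 0.5) -> float:
    """Combine the counts into one accumulated value (assumed weighted sum)."""
    return (w_click * stats.clicks
            + w_action * stats.anchor_actions
            + w_visitor * stats.visitors)


class InfoDisplayObject:
    """Hypothetical indicator drawn in the video, e.g. a bar filled to `value`."""

    def __init__(self, max_value: float = 100.0) -> None:
        self.max_value = max_value
        self.value = 0.0
        self.history: List[float] = []  # past values, usable for later adjustment

    def update(self, accumulated: float) -> float:
        """Set the displayed value from the accumulated information, clamped to range."""
        self.value = min(max(accumulated, 0.0), self.max_value)
        self.history.append(self.value)
        return self.value
```

Keeping a `history` list on the display object is one simple way to make the historical data available when the business object to be displayed is later adjusted.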

The above-described method according to an embodiment of the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the method described herein can be carried out by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the processing methods described herein. Further, when a general-purpose computer accesses code for implementing the processes shown herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing the processes shown herein.

Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.

The above embodiments are only for illustrating the embodiments of the present invention and not for limiting the embodiments of the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention, so that all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims

1. A video image processing method, comprising:

respectively drawing an information display object and at least one business object in a video by adopting a computer drawing mode, wherein the information display object is used for reflecting the information of the operation of the business object; the business object comprises a special effect containing advertisement information;

acquiring trigger operation accumulated information of the at least one business object in the video;

updating the information of the information display object according to the triggering operation accumulated information so that the updated information of the information display object corresponds to the triggering operation accumulated information of the at least one business object;

adjusting the at least one business object to be displayed in the video according to the information displayed by the information display object,

wherein respectively drawing the information display object and the at least one business object in the video by adopting a computer drawing mode comprises:

determining drawing position information corresponding to the information display object and the at least one business object in the video, wherein the drawing position information of the at least one business object is determined by the type of a target object in the video or by feature points of the target object;

and respectively drawing the at least one business object and the information display object in the video by adopting a computer drawing mode according to the drawing position information.

2. The method of claim 1, wherein acquiring the trigger operation accumulated information of the at least one business object in the video comprises:

acquiring the accumulated number of clicks on the at least one business object in the video within a time period;

and determining the trigger operation accumulated information of the at least one business object in the video at least according to the accumulated number of clicks.

3. The method of claim 1, wherein acquiring the trigger operation accumulated information of the at least one business object in the video comprises:

acquiring trigger action accumulated information of an anchor in the video through action detection;

and determining the trigger operation accumulated information of the at least one business object in the video at least according to the trigger action accumulated information.

4. The method of claim 1, wherein acquiring the trigger operation accumulated information of the at least one business object in the video comprises:

acquiring the current number of visitors of the video;

and determining the trigger operation accumulated information of the at least one business object in the video at least according to the current number of visitors.

5. The method of any of claims 1-4, further comprising:

adjusting the at least one business object to be displayed in the video according to the historical data of the information display object.

6. The method of claim 5, wherein the adjusting the at least one business object to be displayed in the video comprises:

adjusting the display duration and/or the display position of the at least one business object to be displayed in the video.

7. The method according to any one of claims 1 to 4, wherein the determining drawing position information corresponding to the information display object and the at least one business object in the video comprises:

determining feature points of a target object from the video, and determining, according to the feature points of the target object, drawing position information of the business object to be drawn in the video image by using a pre-trained convolutional network model for determining the drawing position of a business object in a video image;

or,

determining the type of the target object from the video, and determining the drawing position information of the business object to be drawn according to the type of the target object.

8. The method according to claim 7, wherein the determining drawing position information of the business object to be drawn according to the type of the target object comprises:

when the type of the target object is a face type, determining that the drawing position information of the business object to be drawn comprises at least one of the following: a hair region, a forehead region, a cheek region, a chin region, and a body region other than the head of the person in the video image; and/or,

when the type of the target object is an action type, determining that the drawing position information of the business object to be drawn comprises: a predetermined area in the video image.

9. The method according to any one of claims 1 to 4, wherein the drawing position of the information presentation object includes any region other than a person in the video image.

10. The method of any of claims 1-4, wherein the video comprises a live video in a live platform.

11. The method according to any of claims 1-4, wherein the business object comprises: a special effect containing advertisement information in at least one of the following forms: a two-dimensional sticker special effect, a three-dimensional special effect, and a particle special effect.

12. A video image processing apparatus comprising:

a drawing module, configured to respectively draw an information display object and at least one business object in a video by adopting a computer drawing mode, wherein the information display object is used for reflecting the information of the operation of the business object, and the business object comprises a special effect containing advertisement information;

an acquisition module, configured to acquire trigger operation accumulated information of the at least one business object in the video;

the updating module is used for updating the information of the information display object according to the triggering operation accumulated information so as to enable the updated information of the information display object to correspond to the triggering operation accumulated information of the at least one business object;

a first adjusting module, configured to adjust the at least one business object to be displayed in the video according to the information displayed by the information display object,

wherein the drawing module comprises:

a position determining unit, configured to determine drawing position information corresponding to the information display object and the at least one business object in the video, wherein the drawing position information of the at least one business object is determined by the type of a target object in the video or by feature points of the target object;

and the drawing unit is used for respectively drawing the business object and the information display object in the video by adopting a computer drawing mode according to the drawing position information.

13. The apparatus of claim 12, wherein the acquisition module comprises:

a first acquisition submodule, configured to acquire the accumulated number of clicks on the at least one business object in the video within a time period, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the accumulated number of clicks.

14. The apparatus of claim 12, wherein the acquisition module further comprises:

a second acquisition submodule, configured to acquire trigger action accumulated information of an anchor in the video through action detection, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the trigger action accumulated information.

15. The apparatus of claim 12, wherein the acquisition module further comprises:

a third acquisition submodule, configured to acquire the current number of visitors of the video, and to determine the trigger operation accumulated information of the at least one business object in the video at least according to the current number of visitors.

16. The apparatus of any of claims 12-15, further comprising:

a second adjusting module, configured to adjust the at least one business object to be displayed in the video according to the historical data of the information display object.

17. The apparatus of claim 16,

the second adjusting module is used for adjusting the display duration and/or the display position of the business object to be displayed in the video.

18. The apparatus according to any one of claims 12 to 15,

the drawing module is configured to determine feature points of the target object from the video, and to determine, according to the feature points of the target object, the drawing position information of the business object to be drawn in the video image by using a pre-trained convolutional network model for determining the drawing position of a business object in a video image; or to determine the type of the target object from the video, and to determine the drawing position information of the business object to be drawn according to the type of the target object.

19. The apparatus of claim 18,

the drawing module is configured to determine that the drawing position information of the business object to be drawn includes at least one of the following when the type of the target object is a face type: a hair region, a forehead region, a cheek region, a chin region, and a body region other than the head of the person in the video image; and/or,

20. The apparatus according to any one of claims 12 to 15, wherein the drawing position of the information presentation object includes any region other than a person in the video image.

21. The apparatus of any of claims 12-15, wherein the video comprises a live video in a live platform.

22. The apparatus according to any of claims 12-15, wherein the business object comprises: a special effect containing advertisement information in at least one of the following forms: a two-dimensional sticker special effect, a three-dimensional special effect, and a particle special effect.

23. A terminal device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is configured to store at least one computer program that causes the processor to perform the video image processing method according to any one of claims 1 to 11.