CN111654747A - Bullet screen display method and device

Bullet screen display method and device

Info

Publication number
CN111654747A
Authority
CN
China
Prior art keywords: calculating, frame, significance, bullet screen, video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010540299.5A
Other languages
Chinese (zh)
Other versions
CN111654747B (en)
Inventor
王�琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010540299.5A priority Critical patent/CN111654747B/en
Publication of CN111654747A publication Critical patent/CN111654747A/en
Application granted granted Critical
Publication of CN111654747B publication Critical patent/CN111654747B/en
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4318Generation of visual interfaces for content selection or interaction; Content or additional data rendering by altering the content in the rendering process, e.g. blanking, blurring or masking an image region
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles

Abstract

The invention provides a bullet screen display method, which comprises the following steps: acquiring a video to be processed; calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video; generating a binarization mask according to the temporal saliency and the spatial saliency, wherein the binarization mask comprises an unoccluded area and an occluded area; and determining, according to preset position information of a bullet screen, whether the bullet screen falls within the unoccluded area of the corresponding binarization mask, and hiding any bullet screen located in the unoccluded area. Compared with the prior art, which is limited to manually fixed scenes, the bullet screen display method can be applied to videos with non-fixed scenes and non-fixed target types, and therefore has a wider range of application.

Description

Bullet screen display method and device
Technical Field
The invention relates to the technical field of bullet screen display, in particular to a bullet screen display method and device.
Background
Current bullet screen display technology requires a specific area to be determined in advance; bullet screens are blocked within that area and displayed outside it. However, this approach fails when a region of the video image that deserves the viewer's attention appears outside the predetermined area. Current bullet screen display technology therefore struggles to handle the wide variety of image content that needs to be processed.
Disclosure of Invention
The bullet screen display method and device provided by the invention can effectively process bullet screens in non-fixed scenes.
In a first aspect, the present invention provides a bullet screen display method, including:
acquiring a video to be processed;
calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
generating a binarization mask according to the temporal saliency and the spatial saliency, wherein the binarization mask comprises an unoccluded area and an occluded area;
and determining, according to preset position information of the bullet screen, whether the bullet screen falls within the unoccluded area of the corresponding binarization mask, and hiding any bullet screen located in the unoccluded area.
Optionally, the calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video includes:
calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
selecting two frames separated by a predetermined frame interval from the video to be processed, and extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame; and calculating the unit modulus value of the optical flow according to the optical flow features, and taking the unit modulus value of the optical flow as the temporal saliency.
Optionally, the extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame in the two frames separated by a predetermined interval, and the calculating the unit modulus value of the optical flow as the temporal saliency, include:
segmenting each frame into a predetermined number of regions to be processed according to scene type, calculating the distance between the regions to be processed of the same scene type in the two frames, and assigning the two frames to a same-scene frame set when the distance is within a predetermined threshold;
selecting two frames separated by a predetermined interval from the same-scene frame set, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculating the unit modulus value of the optical flow as the temporal saliency.
Optionally, the generating a binarization mask according to the temporal saliency and the spatial saliency includes:
weighting the spatial saliency and the temporal saliency to obtain a saliency score;
ranking the regions to be processed within the same frame by saliency score, and selecting several of the top-ranked regions to be processed as regions of interest;
and generating the binarization mask according to the regions of interest.
Optionally, the selecting two frames separated by a predetermined interval from the same-scene frame set and the extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame include:
feeding the images of the earlier frame and the later frame into an optical flow extraction network to obtain the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame.
Optionally, the calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features includes:
dividing the video image into a plurality of pixel blocks;
calculating the average luminance of the current pixel block as a first average luminance, and calculating the average luminance of the pixel blocks surrounding the current pixel block as a second average luminance;
calculating the average of the differences between the first average luminance and the second average luminance as the luminance saliency of the current pixel block;
calculating the average chrominance of the current pixel block as a first average chrominance, and calculating the average chrominance of the pixel blocks surrounding the current pixel block as a second average chrominance;
calculating the average of the differences between the first average chrominance and the second average chrominance as the chrominance saliency of the current pixel block;
calculating the average texture feature of the current pixel block as a first average texture feature, and calculating the average texture feature of the pixel blocks surrounding the current pixel block as a second average texture feature;
calculating the average of the differences between the first average texture feature and the second average texture feature as the texture saliency of the current pixel block;
and performing a weighted calculation on the luminance saliency, the chrominance saliency and the texture saliency to obtain the spatial saliency of the current pixel block.
The bullet screen display method of the invention divides the picture into blocks according to target detection and scene segmentation, calculates the spatial saliency of the foreground blocks and the temporal saliency of the background blocks respectively, and weights them to obtain the viewer's region of interest. By judging the positional relation between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The method needs neither a detection model trained for a specific video type nor occlusion-free content set in advance according to the video content; compared with the prior art, which is limited to manually fixed scenes, it can be applied to videos with non-fixed scenes and non-fixed target types, and therefore has a wider range of application.
In a second aspect, the present invention provides a bullet screen display device, comprising:
a video preprocessing module, configured to acquire a video to be processed and to calculate the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
a bullet screen processing module, configured to generate a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising an unoccluded area and an occluded area;
and a bullet screen display module, configured to judge, according to preset position information of the bullet screen, whether the bullet screen falls in the unoccluded area or the occluded area of the corresponding binarization mask, to hide any bullet screen located in the unoccluded area, and to display normally any bullet screen located in the occluded area.
Optionally, the video preprocessing module includes:
a spatial saliency submodule, configured to calculate the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
a temporal saliency submodule, configured to select two frames separated by a predetermined interval from the video to be processed, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, calculate the unit modulus value of the optical flow according to the optical flow features, and take the unit modulus value of the optical flow as the temporal saliency.
Optionally, the temporal saliency submodule includes:
a scene segmentation unit, configured to segment each frame into a predetermined number of regions to be processed according to scene type, calculate the distance between the regions to be processed of the same scene type in the two frames, and assign the two frames to a same-scene frame set when the distance is within a predetermined threshold;
and a saliency calculation unit, configured to select two frames separated by a predetermined interval from the same-scene frame set, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculate the unit modulus value of the optical flow as the temporal saliency.
The bullet screen display device divides the picture into blocks according to target detection and scene segmentation, calculates the spatial saliency of the foreground blocks and the temporal saliency of the background blocks respectively, and weights them to obtain the viewer's region of interest. By judging the positional relation between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The device needs neither a detection model trained for a specific video type nor occlusion-free content set in advance according to the video content; compared with the prior art, which is limited to manually fixed scenes, it can be applied to videos with non-fixed scenes and non-fixed target types, and therefore has a wider range of application.
Drawings
FIG. 1 is a flowchart illustrating a bullet screen display method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a method for determining an area of interest in a bullet screen display method according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating a method for changing a bullet screen display mode in a bullet screen display method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
As shown in fig. 1, the present embodiment provides a bullet screen display method, including:
s1, acquiring a video to be processed;
the video to be processed may be a live video, or a played previously recorded video, and is not limited herein.
S2, calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
as shown in fig. 2: as an optional implementation manner of the step, the spatial saliency of each frame of the video is calculated according to the brightness, the chrominance and the texture characteristics; as a specific optional implementation, the video image is divided into a plurality of pixel blocks; calculating the average value of the brightness of the current pixel block as a first average brightness, and calculating the average value of the brightness of the pixel blocks around the current pixel block as a second average brightness; calculating the average value of the difference values of the first average brightness and the second average brightness as the brightness significance of the current pixel block; calculating the average value of the chroma of the current pixel block as a first average chroma, and calculating the average value of the chroma of the pixel blocks around the current pixel block as a second average chroma; calculating the average value of the difference values of the first average chroma and the second average chroma as the chroma significance of the current pixel block; calculating the average value of the texture features of the current pixel block as a first average texture feature, and calculating the average value of the texture features of the pixel blocks around the current pixel block as a second average texture feature; calculating the average value of the difference values of the first average textural feature and the second average textural feature as the textural feature significance of the current pixel block; and performing weighted calculation on the brightness significance, the chroma significance and the texture feature significance to obtain the spatial domain significance of the current pixel block. For example, a video image may be divided into a plurality of equal-sized pixel blocks, and the spatial saliency of the central pixel block may be determined as the average of the luminance differences between the central pixel block and 8 pixel blocks around the central pixel block. The chrominance saliency and the texture feature saliency may be calculated in the same manner as the spatial saliency described above. As an alternative embodiment, the weighted calculation of the luminance saliency, the chrominance saliency, and the texture feature saliency described above may be used as the spatial saliency.
Continuing with fig. 2, as an optional implementation of this step, two frames separated by a predetermined interval are selected from the video to be processed, and the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame are extracted; the unit modulus value of the optical flow is calculated from the optical flow features and taken as the temporal saliency. As a specific optional implementation, each frame is segmented into a predetermined number of regions to be processed according to scene type, the distance between the regions to be processed of the same scene type in the two frames is calculated, and the two frames are assigned to a same-scene frame set when the distance is within a predetermined threshold; two frames separated by a predetermined interval are then selected from the same-scene frame set, the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame are extracted, and the unit modulus value of the optical flow is calculated as the temporal saliency. As a specific optional implementation, the images of the earlier frame and the later frame are fed into an optical flow extraction network to obtain the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame. In this embodiment, the input images may be downsampled before being fed into the optical flow extraction network.
As an exemplary temporal saliency acquisition process under the above alternative embodiment, the scene segmentation results of video frames spaced K frames apart are compared. The scene segmentation result is a MASK with discrete pixel values at the same resolution as the original video, where the pixel value at each position is the region type at that position (for example, green belt = 0, road = 1, sidewalk = 2). When the distance between the two MASKs is less than a certain threshold, the frames can be considered to belong to the same scene. Within the same scene, the temporal saliency of the (N+K)-th frame is calculated by acquiring the optical flow features of the salient foreground objects contained in the background segmentation blocks: the video images of the N-th and (N+K)-th frames are downsampled and fed into the optical flow extraction network to obtain the optical flow features of the foreground region, and the unit modulus value of the optical flow is calculated and used as the temporal saliency.
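The temporal-saliency path can be sketched as follows, with two stated assumptions where the patent is silent: the distance between two scene-segmentation MASKs is taken to be the fraction of pixels whose label differs, and OpenCV's Farneback optical flow stands in for the unspecified optical flow extraction network. The "unit modulus value" is read here as the mean per-pixel flow magnitude over the foreground region; K, the threshold and the downsampling factor are illustrative.

```python
import cv2
import numpy as np

def same_scene(mask_n: np.ndarray, mask_nk: np.ndarray, thresh: float = 0.1) -> bool:
    """Treat two scene-segmentation MASKs (integer label maps, e.g. green
    belt = 0, road = 1, sidewalk = 2) as the same scene when the fraction of
    pixels whose label changed is below a threshold (distance metric assumed)."""
    return float(np.mean(mask_n != mask_nk)) < thresh

def temporal_saliency(frame_n, frame_nk, fg_mask, scale: float = 0.5) -> float:
    """Mean flow magnitude over the foreground region between frame N and
    frame N+K, computed on downsampled grayscale images. Farneback flow is a
    stand-in for the patent's optical flow extraction network."""
    small = lambda f: cv2.resize(cv2.cvtColor(f, cv2.COLOR_BGR2GRAY), None,
                                 fx=scale, fy=scale)
    g0, g1 = small(frame_n), small(frame_nk)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)           # per-pixel flow magnitude
    fg = cv2.resize(fg_mask.astype(np.uint8),
                    (mag.shape[1], mag.shape[0])) > 0
    return float(mag[fg].mean()) if fg.any() else 0.0
```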
S3, generating a binarization mask according to the temporal saliency and the spatial saliency, wherein the binarization mask comprises an unoccluded area and an occluded area;
continuing with fig. 2, as an optional implementation of this step, weighting the spatial saliency and the temporal saliency to obtain a saliency score; sequencing the regions to be processed in the same frame according to the significance scores, and selecting a plurality of sequenced regions to be processed as regions of interest; and generating a binary mask according to the region of interest. The region of interest and other regions are represented by 0 and 1 respectively by the binarization mask, wherein 0 represents a non-occlusion region, i.e. the bullet screen is not allowed to occlude the video content, and 1 represents an occlusion region, i.e. the bullet screen is allowed to occlude the video content.
S4, determining, according to preset position information of the bullet screen, whether the bullet screen falls within the unoccluded area of the corresponding binarization mask, and hiding any bullet screen located in the unoccluded area.
As shown in fig. 3, when a video is played, the bullet screen moves across the display device. When the bullet screen moves into the unoccluded area of the corresponding binarization mask, that area is a region of interest for the user, who usually does not want it to be covered; the bullet screen over the unoccluded area is therefore hidden. When the bullet screen moves into the occluded area of the corresponding binarization mask, the user usually pays little attention to that area, so even if it is covered by the bullet screen the user experience is not degraded; the bullet screen over the occluded area is therefore displayed.
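The per-frame display decision might then look like the sketch below. The patent describes a bullet screen as being in one area or the other; how partial overlap is resolved is an assumption here (the bullet screen is hidden whenever its preset bounding box touches the unoccluded area).

```python
import numpy as np

def should_display(mask: np.ndarray, x: int, y: int, w: int, h: int) -> bool:
    """Display the bullet screen only if its bounding box lies entirely in the
    occluded (mask == 1) area; hide it as soon as it touches a region of
    interest. (x, y, w, h) come from the bullet screen's preset position
    information (pixel coordinates assumed)."""
    window = mask[y:y + h, x:x + w]
    return window.size > 0 and int(window.min()) == 1

# Illustrative per-frame driver: as each bullet screen moves across the frame,
# re-test its current position against that frame's binarization mask.
# for bullet in active_bullets:
#     bullet.visible = should_display(current_mask, *bullet.bbox)
```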
The method of this embodiment may be applied to electronic devices capable of playing video, such as smartphones, laptops, desktop computers and tablet computers, which are not limited herein. Preferably, the method provided by this embodiment is applied to an electronic device running the Android system.
The bullet screen display method of this embodiment divides the picture into blocks according to target detection and scene segmentation, calculates the spatial saliency of the foreground blocks and the temporal saliency of the background blocks respectively, and weights them to obtain the viewer's region of interest. By judging the positional relation between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The method needs neither a detection model trained for a specific video type nor occlusion-free content set in advance according to the video content; compared with the prior art, which is limited to manually fixed scenes, it can be applied to videos with non-fixed scenes and non-fixed target types, and therefore has a wider range of application.
Example 2
The present embodiment provides a bullet screen display device, including:
a video preprocessing module, configured to acquire a video to be processed and to calculate the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
As an optional implementation, the video preprocessing module includes:
a spatial saliency submodule, configured to calculate the spatial saliency of each frame of the video according to luminance, chrominance and texture features. As a specific optional embodiment, the spatial saliency submodule calculates the spatial saliency as follows: the video image is divided into a plurality of pixel blocks; the average luminance of the current pixel block is calculated as a first average luminance, and the average luminance of the pixel blocks surrounding the current pixel block is calculated as a second average luminance; the average of the differences between the first average luminance and the second average luminance is taken as the luminance saliency of the current pixel block; the chrominance saliency and the texture saliency are calculated in the same manner, from the average chrominance and the average texture features of the current block and of its surrounding blocks; and a weighted calculation of the luminance saliency, the chrominance saliency and the texture saliency gives the spatial saliency of the current pixel block. For example, a video image may be divided into a plurality of equal-sized pixel blocks, and the luminance saliency of a central pixel block determined as the average of the luminance differences between the central pixel block and the 8 pixel blocks around it.
a temporal saliency submodule, configured to select two frames separated by a predetermined interval from the video to be processed, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, calculate the unit modulus value of the optical flow according to the optical flow features, and take the unit modulus value of the optical flow as the temporal saliency. As an optional implementation, the temporal saliency submodule includes: a scene segmentation unit, configured to segment each frame into a predetermined number of regions to be processed according to scene type, calculate the distance between the regions to be processed of the same scene type in the two frames, and assign the two frames to a same-scene frame set when the distance is within a predetermined threshold; and a saliency calculation unit, configured to select two frames separated by a predetermined interval from the same-scene frame set, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculate the unit modulus value of the optical flow as the temporal saliency. As a specific optional implementation, the scene segmentation unit compares the scene segmentation results of video frames spaced K frames apart. The scene segmentation result is a MASK with discrete pixel values at the same resolution as the original video, where the pixel value at each position is the region type at that position (for example, green belt = 0, road = 1, sidewalk = 2); when the distance between the two MASKs is less than a certain threshold, the frames can be considered to belong to the same scene. Within the same scene, the saliency calculation unit calculates the temporal saliency of the (N+K)-th frame by acquiring the optical flow features of the salient foreground objects contained in the background segmentation blocks: the video images of the N-th and (N+K)-th frames are downsampled and fed into the optical flow extraction network to obtain the optical flow features of the foreground region, and the unit modulus value of the optical flow is calculated and used as the temporal saliency.
a bullet screen processing module, configured to generate a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising an unoccluded area and an occluded area. As a specific optional implementation, the spatial saliency and the temporal saliency are weighted to obtain a saliency score; the regions to be processed within the same frame are ranked by saliency score, and several of the top-ranked regions to be processed are selected as regions of interest; the binarization mask is then generated from the regions of interest. In the binarization mask, the regions of interest and the other regions are represented by 0 and 1 respectively: 0 denotes the unoccluded area, where the bullet screen is not allowed to occlude the video content, and 1 denotes the occluded area, where the bullet screen is allowed to occlude the video content.
and a bullet screen display module, configured to judge, according to preset position information of the bullet screen, whether the bullet screen falls in the unoccluded area or the occluded area of the corresponding binarization mask, to hide any bullet screen located in the unoccluded area, and to display normally any bullet screen located in the occluded area. When a video is played, the bullet screen moves across the display device. When the bullet screen moves into the unoccluded area of the corresponding binarization mask, that area is a region of interest for the user, who usually does not want it to be covered, so the bullet screen over the unoccluded area is hidden; when the bullet screen moves into the occluded area of the corresponding binarization mask, the user usually pays little attention to that area, so even if it is covered by the bullet screen the user experience is not degraded, and the bullet screen over the occluded area is displayed.
The bullet screen display device of this embodiment divides the picture into blocks according to target detection and scene segmentation, calculates the spatial saliency of the foreground blocks and the temporal saliency of the background blocks respectively, and weights them to obtain the viewer's region of interest. By judging the positional relation between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The device needs neither a detection model trained for a specific video type nor occlusion-free content set in advance according to the video content; compared with the prior art, which is limited to manually fixed scenes, it can be applied to videos with non-fixed scenes and non-fixed target types, and therefore has a wider range of application.
It will be understood by those skilled in the art that all or part of the processes of the embodiments of the methods described above may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A bullet screen display method, characterized in that the method comprises the following steps:
acquiring a video to be processed;
calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
generating a binarization mask according to the temporal saliency and the spatial saliency, wherein the binarization mask comprises an unoccluded area and an occluded area;
and determining, according to preset position information of the bullet screen, whether the bullet screen falls within the unoccluded area of the corresponding binarization mask, and hiding any bullet screen located in the unoccluded area.
2. The bullet screen display method of claim 1, characterized in that the calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video comprises:
calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
selecting two frames separated by a predetermined frame interval from the video to be processed, and extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame; and calculating the unit modulus value of the optical flow according to the optical flow features, and taking the unit modulus value of the optical flow as the temporal saliency.
3. The bullet screen display method of claim 2, characterized in that the extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame in the two frames separated by a predetermined interval, and the calculating the unit modulus value of the optical flow as the temporal saliency, comprise:
segmenting each frame into a predetermined number of regions to be processed according to scene type, calculating the distance between the regions to be processed of the same scene type in the two frames, and assigning the two frames to a same-scene frame set when the distance is within a predetermined threshold;
selecting two frames separated by a predetermined interval from the same-scene frame set, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculating the unit modulus value of the optical flow as the temporal saliency.
4. The bullet screen display method of claim 3, characterized in that the generating a binarization mask according to the temporal saliency and the spatial saliency comprises:
weighting the spatial saliency and the temporal saliency to obtain a saliency score;
ranking the regions to be processed within the same frame by saliency score, and selecting several of the top-ranked regions to be processed as regions of interest;
and generating the binarization mask according to the regions of interest.
5. The bullet screen display method of claim 3, characterized in that the selecting two frames separated by a predetermined interval from the same-scene frame set and the extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame comprise:
feeding the images of the earlier frame and the later frame into an optical flow extraction network to obtain the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame.
6. The bullet screen display method of claim 2, characterized in that the calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features comprises:
dividing the video image into a plurality of pixel blocks;
calculating the average luminance of the current pixel block as a first average luminance, and calculating the average luminance of the pixel blocks surrounding the current pixel block as a second average luminance;
calculating the average of the differences between the first average luminance and the second average luminance as the luminance saliency of the current pixel block;
calculating the average chrominance of the current pixel block as a first average chrominance, and calculating the average chrominance of the pixel blocks surrounding the current pixel block as a second average chrominance;
calculating the average of the differences between the first average chrominance and the second average chrominance as the chrominance saliency of the current pixel block;
calculating the average texture feature of the current pixel block as a first average texture feature, and calculating the average texture feature of the pixel blocks surrounding the current pixel block as a second average texture feature;
calculating the average of the differences between the first average texture feature and the second average texture feature as the texture saliency of the current pixel block;
and performing a weighted calculation on the luminance saliency, the chrominance saliency and the texture saliency to obtain the spatial saliency of the current pixel block.
7. A bullet screen display device, characterized in that the device comprises:
a video preprocessing module, configured to acquire a video to be processed and to calculate the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
a bullet screen processing module, configured to generate a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising an unoccluded area and an occluded area;
and a bullet screen display module, configured to judge, according to preset position information of the bullet screen, whether the bullet screen falls in the unoccluded area or the occluded area of the corresponding binarization mask, to hide any bullet screen located in the unoccluded area, and to display normally any bullet screen located in the occluded area.
8. The bullet screen display device of claim 7, characterized in that the video preprocessing module comprises:
a spatial saliency submodule, configured to calculate the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
a temporal saliency submodule, configured to select two frames separated by a predetermined interval from the video to be processed, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, calculate the unit modulus value of the optical flow according to the optical flow features, and take the unit modulus value of the optical flow as the temporal saliency.
9. The bullet screen display device of claim 8, characterized in that the temporal saliency submodule comprises:
a scene segmentation unit, configured to segment each frame into a predetermined number of regions to be processed according to scene type, calculate the distance between the regions to be processed of the same scene type in the two frames, and assign the two frames to a same-scene frame set when the distance is within a predetermined threshold;
and a saliency calculation unit, configured to select two frames separated by a predetermined interval from the same-scene frame set, extract the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculate the unit modulus value of the optical flow as the temporal saliency.
CN202010540299.5A 2020-06-12 2020-06-12 Bullet screen display method and device Active CN111654747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010540299.5A CN111654747B (en) 2020-06-12 2020-06-12 Bullet screen display method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010540299.5A CN111654747B (en) 2020-06-12 2020-06-12 Bullet screen display method and device

Publications (2)

Publication Number Publication Date
CN111654747A true CN111654747A (en) 2020-09-11
CN111654747B CN111654747B (en) 2022-07-26

Family

ID=72350513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010540299.5A Active CN111654747B (en) 2020-06-12 2020-06-12 Bullet screen display method and device

Country Status (1)

Country Link
CN (1) CN111654747B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007095224A2 (en) * 2006-02-10 2007-08-23 Metacarta, Inc. Systems and methods for spatial thumbnails and companion maps for media objects
CN103686178A (en) * 2013-12-04 2014-03-26 北京邮电大学 Method for extracting area-of-interest of video based on HVS
CN106101756A (en) * 2016-06-07 2016-11-09 腾讯科技(北京)有限公司 Barrage display packing, barrage adding method, Apparatus and system
CN106210902A (en) * 2016-07-06 2016-12-07 华东师范大学 A kind of cameo shot clipping method based on barrage comment data
CN106503683A (en) * 2016-10-28 2017-03-15 武汉大学 A kind of video well-marked target detection method based on dynamic focal point
CN107809658A (en) * 2017-10-18 2018-03-16 维沃移动通信有限公司 A kind of barrage content display method and terminal
CN110248209A (en) * 2019-07-19 2019-09-17 湖南快乐阳光互动娱乐传媒有限公司 A kind of transmission method and system of the anti-block masks information of barrage

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xiao Jun et al., "Simulation of saliency detection in surveillance video for criminal investigation", Computer Simulation (《计算机仿真》) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055A (en) * 2020-10-15 2021-01-15 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device
US11538141B2 (en) 2020-10-15 2022-12-27 Beijing Dajia Internet Information Technology Co., Ltd. Method and apparatus for processing video
CN114005076A (en) * 2021-12-30 2022-02-01 北京搜狐新动力信息技术有限公司 Image anti-blocking method and system and electronic equipment

Also Published As

Publication number Publication date
CN111654747B (en) 2022-07-26

Similar Documents

Publication Publication Date Title
CN109325933B (en) Method and device for recognizing copied image
CN106254933B (en) Subtitle extraction method and device
Fuentes et al. People tracking in surveillance applications
US7123769B2 (en) Shot boundary detection
CN110300316B (en) Method and device for implanting push information into video, electronic equipment and storage medium
EP1274251A2 (en) Method and apparatus for segmenting a pixellated image
US8355079B2 (en) Temporally consistent caption detection on videos using a 3D spatiotemporal method
EP3238213B1 (en) Method and apparatus for generating an extrapolated image based on object detection
KR20150114437A (en) Image processing apparatus and image processing method
JP4373840B2 (en) Moving object tracking method, moving object tracking program and recording medium thereof, and moving object tracking apparatus
JP2005513656A (en) Method for identifying moving objects in a video using volume growth and change detection masks
CN111654747B (en) Bullet screen display method and device
CN107801093B (en) Video rendering method and device, computer equipment and readable storage medium
CN111723713B (en) Video key frame extraction method and system based on optical flow method
CN110830788A (en) Method and device for detecting black screen image
EP2017788A1 (en) Shielding-object video-image identifying device and method
CN114677394A (en) Matting method, matting device, image pickup apparatus, conference system, electronic apparatus, and medium
CN112565887A (en) Video processing method, device, terminal and storage medium
CN116051477A (en) Image noise detection method and device for ultra-high definition video file
CN113490009B (en) Content information implantation method, device, server and storage medium
CN112465853B (en) Background conversion method and device for video picture, electronic equipment and storage medium
JPH10155139A (en) Image processor and image processing method
CN110996173B (en) Image data processing method and device and storage medium
CN112085002A (en) Portrait segmentation method, portrait segmentation device, storage medium and electronic equipment
EP2614489B1 (en) Method and system for obtaining a control information related to a digital image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant