CN111654747B - Bullet screen display method and device - Google Patents


Info

Publication number
CN111654747B
CN111654747B (application CN202010540299.5A)
Authority
CN
China
Prior art keywords
significance
frame
calculating
bullet screen
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010540299.5A
Other languages
Chinese (zh)
Other versions
CN111654747A (en)
Inventor
王琦 (Wang Qi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010540299.5A priority Critical patent/CN111654747B/en
Publication of CN111654747A publication Critical patent/CN111654747A/en
Application granted granted Critical
Publication of CN111654747B publication Critical patent/CN111654747B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04N 21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/4318: Generation of visual interfaces for content selection or interaction, by altering the content in the rendering process, e.g. blanking, blurring or masking an image region
    • H04N 21/4728: End-user interface for manipulating displayed content, for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • H04N 21/4884: Data services, e.g. news ticker, for displaying subtitles

Abstract

The invention provides a bullet screen (danmaku) display method comprising the following steps: acquiring a video to be processed; calculating the temporal saliency and spatial saliency of the video image according to the image features of each frame in the video; generating a binarization mask from the temporal and spatial saliency, the mask comprising a non-occlusion region and an occlusion region; and, according to preset bullet screen position information, determining whether a bullet screen falls within the non-occlusion region of the corresponding binarization mask and hiding any bullet screen located in that region. Unlike the prior art, which is limited to manually fixed scenes, the bullet screen display method applies to videos of non-fixed scenes and non-fixed target types, giving a wider range of application.

Description

Bullet screen display method and device
Technical Field
The invention relates to the technical field of bullet screen display, in particular to a bullet screen display method and device.
Background
Current bullet screen display technology requires a specific region to be predetermined: bullet screens inside that region are blocked from view, while bullet screens outside it are displayed. However, if a region of the video image that deserves the viewer's attention appears outside the predetermined region, this approach fails. Based on current bullet screen display technology, it is therefore difficult to handle effectively the wide variety of image content that needs processing.
Disclosure of Invention
The bullet screen display method and device provided by the invention can effectively handle bullet screens in non-fixed scenes.
In a first aspect, the present invention provides a bullet screen display method, including:
acquiring a video to be processed;
calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
generating a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising a non-occlusion region and an occlusion region;
and, according to preset bullet screen position information, determining whether the bullet screen falls within the non-occlusion region of the corresponding binarization mask, and hiding any bullet screen located in the non-occlusion region.
Optionally, the calculating of the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video includes:
calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
selecting two frames spaced by a predetermined number of frames from the video to be processed, and extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame; and calculating the unit modulus value of the optical flow from the optical flow features and taking it as the temporal saliency.
Optionally, the selecting of the two frames, the extracting of the optical flow features, and the calculating of the unit modulus value of the optical flow as the temporal saliency include:
segmenting each frame into a predetermined number of regions to be processed according to scene type, calculating the distance between regions of the same scene type in the two frames, and assigning the two frames to a set of frames of the same scene when the distance is within a predetermined threshold;
selecting two frames spaced by a predetermined number of frames from the set of frames of the same scene, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculating the unit modulus value of the optical flow as the temporal saliency.
Optionally, the generating of a binarization mask according to the temporal saliency and the spatial saliency includes:
weighting the spatial saliency and the temporal saliency to obtain a saliency score;
ranking the regions to be processed within the same frame by saliency score, and selecting a number of the top-ranked regions as regions of interest;
and generating the binarization mask from the regions of interest.
Optionally, the selecting of two frames spaced by a predetermined number of frames from the set of frames of the same scene and the extracting of the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame include:
feeding the images of the earlier and later frames into an optical flow extraction network to obtain the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame.
Optionally, the calculating of the spatial saliency of each frame of the video according to luminance, chrominance and texture features includes:
dividing the video image into a number of pixel blocks;
calculating the average luminance of the current pixel block as a first average luminance, and the average luminance of the pixel blocks surrounding the current pixel block as a second average luminance;
calculating the average of the differences between the first and second average luminance as the luminance saliency of the current pixel block;
calculating the average chrominance of the current pixel block as a first average chrominance, and the average chrominance of the surrounding pixel blocks as a second average chrominance;
calculating the average of the differences between the first and second average chrominance as the chrominance saliency of the current pixel block;
calculating the average texture feature of the current pixel block as a first average texture feature, and the average texture feature of the surrounding pixel blocks as a second average texture feature;
calculating the average of the differences between the first and second average texture features as the texture feature saliency of the current pixel block;
and weighting the luminance saliency, the chrominance saliency and the texture feature saliency to obtain the spatial saliency of the current pixel block.
The bullet screen display method of the invention partitions the picture into blocks based on object detection and scene segmentation, computes the spatial saliency of foreground blocks and the temporal saliency of background blocks, and weights them to obtain the viewer's region of interest. By judging the positional relationship between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The method needs neither a detection model trained for a specific video type nor occlusion-free content configured in advance for the video content; unlike the prior art, which is limited to manually fixed scenes, it applies to videos of non-fixed scenes and non-fixed target types, giving a wider range of application.
In a second aspect, the present invention provides a bullet screen display device, including:
the video preprocessing module, used for acquiring a video to be processed and calculating the temporal saliency and spatial saliency of the video image according to the image features of each frame in the video;
the bullet screen processing module, used for generating a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising a non-occlusion region and an occlusion region;
and the bullet screen display module, used for judging, according to preset bullet screen position information, whether a bullet screen falls within the non-occlusion region or the occlusion region of the corresponding binarization mask, hiding bullet screens in the non-occlusion region, and displaying bullet screens in the occlusion region normally.
Optionally, the video pre-processing module includes:
the spatial saliency submodule, used for calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features;
and the temporal saliency submodule, used for selecting two frames spaced by a predetermined number of frames from the video to be processed, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, calculating the unit modulus value of the optical flow from the optical flow features, and taking it as the temporal saliency.
Optionally, the temporal saliency submodule includes:
the scene segmentation unit, used for segmenting each frame into a predetermined number of regions to be processed according to scene type, calculating the distance between regions of the same scene type in the two frames, and assigning the two frames to a set of frames of the same scene when the distance is within a predetermined threshold;
and the saliency calculation unit, used for selecting two frames spaced by a predetermined number of frames from the set of frames of the same scene, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculating the unit modulus value of the optical flow as the temporal saliency.
The bullet screen display device partitions the picture into blocks based on object detection and scene segmentation, computes the spatial saliency of foreground blocks and the temporal saliency of background blocks, and weights them to obtain the viewer's region of interest. By judging the positional relationship between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The device needs neither a detection model trained for a specific video type nor occlusion-free content configured in advance for the video content; unlike the prior art, which is limited to manually fixed scenes, it applies to videos of non-fixed scenes and non-fixed target types, giving a wider range of application.
Drawings
FIG. 1 is a flowchart illustrating a bullet screen display method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a method for determining an area of interest in a bullet screen display method according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating a method for changing a bullet screen display mode in a bullet screen display method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example 1
As shown in fig. 1, the present embodiment provides a bullet screen display method, including:
s1, acquiring a video to be processed;
the video to be processed may be a live video, or a played and previously recorded video, and is not limited herein.
S2, calculating the temporal saliency and the spatial saliency of the video image according to the image features of each frame in the video;
as shown in fig. 2: as an optional implementation manner of the step, the spatial saliency of each frame of the video is calculated according to the brightness, the chrominance and the texture characteristics; as a specific optional implementation, the video image is divided into a plurality of pixel blocks; calculating the average value of the brightness of the current pixel block as a first average brightness, and calculating the average value of the brightness of the pixel blocks around the current pixel block as a second average brightness; calculating the average value of the difference values of the first average brightness and the second average brightness as the brightness significance of the current pixel block; calculating the average value of the chroma of the current pixel block as a first average chroma, and calculating the average value of the chroma of the pixel blocks around the current pixel block as a second average chroma; calculating the average value of the difference values of the first average chroma and the second average chroma as the chroma significance of the current pixel block; calculating the average value of the texture features of the current pixel block as a first average texture feature, and calculating the average value of the texture features of the pixel blocks around the current pixel block as a second average texture feature; calculating the average value of the difference values of the first average textural feature and the second average textural feature as the significance of the textural feature of the current pixel block; and performing weighted calculation on the brightness significance, the chroma significance and the texture feature significance to obtain the spatial domain significance of the current pixel block. 
For example, the video image may be divided into several pixel blocks of equal size, and the average of the luminance differences between a central pixel block and the 8 pixel blocks surrounding it used as the luminance saliency of that central block. The chrominance saliency and the texture feature saliency may be calculated in the same manner as the luminance saliency described above. As an alternative embodiment, a weighted combination of the luminance saliency, the chrominance saliency, and the texture feature saliency may be used as the spatial saliency.
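The block-wise spatial saliency described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the block size of 8, the edge-replication padding at frame borders, and the weights (0.5, 0.25, 0.25) are assumptions, and `luma`, `chroma`, and `texture` stand for any per-pixel feature channels of one frame.

```python
import numpy as np

def block_means(channel, bs):
    # Average value of each bs-by-bs pixel block of one image channel
    # (any rows/columns that do not fill a whole block are dropped).
    h, w = channel.shape
    return channel[:h - h % bs, :w - w % bs].reshape(
        h // bs, bs, w // bs, bs).mean(axis=(1, 3))

def neighbor_saliency(means):
    # For each block, the mean absolute difference between its average
    # value and those of its 8 surrounding blocks; frame borders are
    # handled by replicating the edge blocks.
    p = np.pad(means, 1, mode="edge")
    diffs = [np.abs(means - p[1 + dy:1 + dy + means.shape[0],
                              1 + dx:1 + dx + means.shape[1]])
             for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if (dy, dx) != (0, 0)]
    return np.mean(diffs, axis=0)

def spatial_saliency(luma, chroma, texture, bs=8, w=(0.5, 0.25, 0.25)):
    # Weighted sum of luminance, chrominance and texture saliency,
    # giving one spatial saliency value per pixel block.
    return sum(wi * neighbor_saliency(block_means(c, bs))
               for wi, c in zip(w, (luma, chroma, texture)))
```

A bright block surrounded by dark blocks thus receives a high luminance saliency, matching the central-block example above.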
Continuing with fig. 2, as an alternative embodiment of this step, two frames spaced by a predetermined number of frames are selected from the video to be processed, and the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame are extracted; the unit modulus value of the optical flow is then calculated from the optical flow features and taken as the temporal saliency. As a specific alternative embodiment, each frame is segmented into a predetermined number of regions to be processed according to scene type, the distance between regions of the same scene type in the two frames is calculated, and the two frames are assigned to a set of frames of the same scene when the distance is within a predetermined threshold; two frames spaced by a predetermined number of frames are then selected from that set, the optical flow features of the foreground object of the later frame relative to the earlier frame are extracted, and the unit modulus value of the optical flow is calculated as the temporal saliency. As a further specific alternative embodiment, the images of the earlier and later frames are fed into an optical flow extraction network to obtain the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame. In the above embodiment, the optical flow extraction network may be a downsampled optical flow extraction network.
As an exemplary temporal saliency acquisition process for the above alternative embodiment, the scene segmentation results of video frames spaced K frames apart are compared. The scene segmentation result is a MASK with discrete pixel values and the same resolution as the original video, where the pixel value at a position is the region type at that location (for example, green belt is 0, road is 1, and sidewalk is 2). When the distance between the two MASKs is less than a certain threshold, the frames may be considered to belong to the same scene. Within the same scene, the temporal saliency of the (N + K)th frame is calculated by acquiring the optical flow features of the salient foreground objects contained in the background segmentation blocks: the video images of the Nth and (N + K)th frames are downsampled and fed into an optical flow extraction network to obtain the optical flow features of the foreground region, and the unit modulus value of the optical flow is calculated as the temporal saliency.
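The same-scene test and the temporal saliency computation above can be sketched as follows. Two interpretations here are assumptions, not taken from the patent text: the "distance" between two MASKs is read as the fraction of pixels whose region labels differ, and the "unit modulus value" is read as the mean flow magnitude normalised by the largest magnitude in the frame. The flow field itself would come from an optical flow extraction network, which is outside this sketch.

```python
import numpy as np

def same_scene(mask_a, mask_b, threshold=0.1):
    # Two scene-segmentation MASKs (per-pixel region-type labels, e.g.
    # green belt 0, road 1, sidewalk 2) are treated as the same scene
    # when the fraction of differing pixels is below the threshold.
    return bool(np.mean(mask_a != mask_b) < threshold)

def temporal_saliency(flow):
    # flow: H x W x 2 optical-flow field of the foreground region for
    # frame N + K relative to frame N (assumed precomputed by a
    # downsampled optical-flow network). Returns a scalar in [0, 1].
    mag = np.linalg.norm(flow, axis=-1)
    peak = mag.max()
    return float(mag.mean() / peak) if peak > 0 else 0.0
```

A static foreground thus yields zero temporal saliency, while uniform fast motion yields the maximum value.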
S3, generating a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising a non-occlusion region and an occlusion region;
Continuing with fig. 2, as an alternative embodiment of this step, the spatial saliency and the temporal saliency are weighted to obtain a saliency score; the regions to be processed within the same frame are ranked by saliency score, and a number of the top-ranked regions are selected as regions of interest; the binarization mask is then generated from the regions of interest. In the binarization mask, the region of interest and the other regions are represented by 0 and 1 respectively: 0 denotes the non-occlusion region, i.e. bullet screens are not allowed to occlude the video content there, and 1 denotes the occlusion region, i.e. bullet screens are allowed to occlude the video content.
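The scoring and mask generation above can be sketched as follows; the equal weights, the rectangular region boxes, and the `num_roi` parameter are illustrative assumptions, since the patent does not fix the weighting or the number of regions of interest.

```python
import numpy as np

def saliency_scores(spatial, temporal, w_s=0.5, w_t=0.5):
    # Weighted combination of per-region spatial and temporal saliency.
    return w_s * np.asarray(spatial) + w_t * np.asarray(temporal)

def roi_mask(frame_shape, regions, scores, num_roi):
    # regions: (y0, y1, x0, x1) boxes of the to-be-processed regions of
    # one frame; scores: their saliency scores. The num_roi highest-
    # scoring regions become the non-occlusion region (value 0);
    # everything else is the occlusion region (value 1).
    mask = np.ones(frame_shape, dtype=np.uint8)
    for i in np.argsort(scores)[::-1][:num_roi]:
        y0, y1, x0, x1 = regions[i]
        mask[y0:y1, x0:x1] = 0
    return mask
```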
S4, according to preset bullet screen position information, determining whether the bullet screen falls within the non-occlusion region of the corresponding binarization mask, and hiding any bullet screen located in the non-occlusion region.
As shown in fig. 3, bullet screens move across the display device while the video plays. When a bullet screen moves into the non-occlusion region of the binarization mask, the region is of interest to the user, who usually does not want it occluded, so the bullet screen is hidden. When a bullet screen moves into the occlusion region, the user usually pays little attention to that region, so being occluded by the bullet screen does not degrade the user experience, and the bullet screen is displayed.
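The per-bullet decision above can be sketched as follows; the bounding-box representation of a bullet screen's position and the rule that any overlap with the non-occlusion region hides it are assumptions about how "judging the positional relationship" would be implemented.

```python
import numpy as np

def bullet_visible(mask, x, y, w, h):
    # mask: the binarization mask (0 = non-occlusion, 1 = occlusion) at
    # video resolution. A bullet screen occupying the box (x, y, w, h)
    # is hidden as soon as it touches any non-occlusion pixel; one
    # lying entirely in the occlusion region is displayed normally.
    box = mask[y:y + h, x:x + w]
    return bool(box.size > 0 and box.min() == 1)
```

As the bullet screen scrolls, this check would be re-evaluated per frame against that frame's mask, toggling the bullet between hidden and shown.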
In this embodiment, the method may be applied to electronic devices capable of playing videos, such as smartphones, laptops, desktop computers, and tablet computers, which are not limited herein. Preferably, the method provided by this embodiment is applied to an electronic device running the Android system.
According to the bullet screen display method of this embodiment, the picture is partitioned into blocks based on object detection and scene segmentation, the spatial saliency of foreground blocks and the temporal saliency of background blocks are computed and weighted, and the viewer's region of interest is obtained. By judging the positional relationship between a bullet screen and this region, the display mode of the bullet screen is changed, improving the viewing experience. The method needs neither a detection model trained for a specific video type nor occlusion-free content configured in advance for the video content; unlike the prior art, which is limited to manually fixed scenes, it applies to videos of non-fixed scenes and non-fixed target types, giving a wider range of application.
Example 2
The present embodiment provides a bullet screen display device, including:
the video preprocessing module, used for acquiring a video to be processed and calculating the temporal saliency and spatial saliency of the video image according to the image features of each frame in the video;
as an optional implementation, the video pre-processing module includes:
the spatial saliency submodule is used for calculating the spatial saliency of each frame of the video according to luminance, chrominance and texture features. As a specific alternative embodiment, the spatial saliency submodule calculates the spatial saliency as follows: the video image is divided into a number of pixel blocks; the average luminance of the current pixel block is calculated as a first average luminance, and the average luminance of the surrounding pixel blocks as a second average luminance; the average of the differences between the first and second average luminance is taken as the luminance saliency of the current pixel block; the average chrominance of the current pixel block is calculated as a first average chrominance, and the average chrominance of the surrounding pixel blocks as a second average chrominance; the average of the differences between the first and second average chrominance is taken as the chrominance saliency of the current pixel block; the average texture feature of the current pixel block is calculated as a first average texture feature, and the average texture feature of the surrounding pixel blocks as a second average texture feature; the average of the differences between the first and second average texture features is taken as the texture feature saliency of the current pixel block; and the luminance saliency, the chrominance saliency and the texture feature saliency are weighted to obtain the spatial saliency of the current pixel block. For example, the video image may be divided into several pixel blocks of equal size, and the average of the luminance differences between a central pixel block and the 8 pixel blocks surrounding it used as the luminance saliency of that central block. The chrominance saliency and the texture feature saliency may be calculated in the same manner as the luminance saliency. As an alternative embodiment, a weighted combination of the luminance saliency, the chrominance saliency, and the texture feature saliency may be used as the spatial saliency.
The temporal saliency submodule is used for selecting two frames spaced by a predetermined number of frames from the video to be processed, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, calculating the unit modulus value of the optical flow from the optical flow features, and taking it as the temporal saliency. As an alternative embodiment, the temporal saliency submodule includes: a scene segmentation unit, used for segmenting each frame into a predetermined number of regions to be processed according to scene type, calculating the distance between regions of the same scene type in the two frames, and assigning the two frames to a set of frames of the same scene when the distance is within a predetermined threshold; and a saliency calculation unit, used for selecting two frames spaced by a predetermined number of frames from the set of frames of the same scene, extracting the optical flow features of the foreground object of the later frame relative to the foreground object of the earlier frame, and calculating the unit modulus value of the optical flow as the temporal saliency. As a specific alternative embodiment, the scene segmentation unit compares the scene segmentation results of video frames spaced K frames apart. The scene segmentation result is a MASK with discrete pixel values and the same resolution as the original video, where the pixel value at a position is the region type at that location (for example, green belt is 0, road is 1, and sidewalk is 2). When the distance between the two MASKs is less than a certain threshold, the frames may be considered to belong to the same scene.
Within the same scene, the saliency calculation unit calculates the temporal saliency of the (N + K)th frame by acquiring the optical flow features of the salient foreground objects contained in the background segmentation blocks: the video images of the Nth and (N + K)th frames are downsampled and fed into an optical flow extraction network to obtain the optical flow features of the foreground region, and the unit modulus value of the optical flow is calculated as the temporal saliency.
The bullet screen processing module is used for generating a binarization mask according to the temporal saliency and the spatial saliency, the binarization mask comprising a non-occlusion region and an occlusion region. As a specific alternative embodiment, the spatial saliency and the temporal saliency are weighted to obtain a saliency score; the regions to be processed within the same frame are ranked by saliency score, and a number of the top-ranked regions are selected as regions of interest; the binarization mask is then generated from the regions of interest. In the binarization mask, the region of interest and the other regions are represented by 0 and 1 respectively: 0 denotes the non-occlusion region, i.e. bullet screens are not allowed to occlude the video content there, and 1 denotes the occlusion region, i.e. bullet screens are allowed to occlude the video content.
The bullet screen display module is used for judging, according to preset position information of the bullet screen, whether the bullet screen falls in the unoccluded area or the occluded area of the corresponding binarization mask, hiding the bullet screen when it is in the unoccluded area, and displaying it normally when it is in the occluded area. While the video is playing, the bullet screen moves across the display device. When it moves over an unoccluded area of the binarization mask, that area is a region of interest which the viewer typically does not want covered, so the corresponding bullet screen is hidden; when it moves over an occluded area, the viewer usually pays little attention to that area, and being covered by the bullet screen does not degrade the viewing experience, so the bullet screen is displayed.
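The show/hide decision can be sketched as follows; representing the bullet screen position as a bounding box and hiding on any overlap with an unoccluded pixel are assumptions, since the patent only says the position information is checked against the mask.

```python
import numpy as np

def show_bullet(mask: np.ndarray, x: int, y: int, w: int, h: int) -> bool:
    """Decide whether a bullet screen comment at its current position is drawn.

    `mask` is the frame's binarization mask (0 = unoccluded region of
    interest, 1 = occlusion allowed) and (x, y, w, h) the comment's bounding
    box.  Policy: hide the comment as soon as its box touches any unoccluded
    pixel, otherwise display it normally.
    """
    window = mask[y:y + h, x:x + w]
    return bool(window.all())   # all 1s -> occlusion allowed -> display
```

As the comment scrolls, this check would run per frame with its updated x coordinate, so the same comment can disappear over a region of interest and reappear afterwards.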
The bullet screen display device of this embodiment divides each picture into blocks according to target detection and scene segmentation, calculates the spatial domain saliency of the foreground blocks and the temporal domain saliency of the background blocks, and weights them to obtain the viewer's region of interest. By judging the positional relationship between the bullet screen and this region, the display mode of the bullet screen is changed and the viewing experience is improved. The device needs neither a detection model trained for a specific video type nor occlusion-free content set in advance according to the video content; unlike the prior art, which is limited to manually fixed scenes, it can be applied to videos with non-fixed scenes and non-fixed target types and therefore has a wider range of application.
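The spatial domain saliency used by the device (detailed in claim 5 below as block-wise center-surround differences of luminance, chrominance, and texture) can be sketched as follows; the feature pairing and the weight values are illustrative assumptions.

```python
import numpy as np

def spatial_saliency(luma, chroma, texture, weights=(0.4, 0.3, 0.3)) -> float:
    """Center-surround spatial saliency of one pixel block.

    `luma`, `chroma`, `texture` are each a pair (current_block, surround) of
    numpy arrays: the feature values inside the current pixel block and in its
    neighbouring blocks.  Per feature, the mean of the current block is
    compared with the mean of the surround, and the absolute difference is
    that feature's saliency; the three saliencies are then combined by a
    weighted sum.
    """
    def feature_saliency(current, surround):
        return abs(float(np.mean(current)) - float(np.mean(surround)))

    saliencies = [feature_saliency(c, s) for c, s in (luma, chroma, texture)]
    return float(np.dot(weights, saliencies))
```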
It will be understood by those skilled in the art that all or part of the processes of the above method embodiments may be implemented by a computer program; the program may be stored in a computer-readable storage medium and, when executed, may carry out the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A bullet screen display method is characterized in that: the method comprises the following steps:
acquiring a video to be processed, and performing picture division on each frame of image according to target detection and scene segmentation to form a foreground block and a background block;
calculating the time domain significance of a background block and the space domain significance of a foreground block of a video image according to the image characteristics of each frame in the video;
generating a binarization mask according to the time domain significance of the background block and the space domain significance of the foreground block, wherein the binarization mask comprises an unoccluded area and an occluded area;
judging, according to preset position information of the bullet screen, whether the bullet screen is in the unoccluded area of the corresponding binarization mask, and hiding the bullet screen when it is in the unoccluded area;
wherein, the generating a binary mask according to the time domain significance and the space domain significance comprises:
weighting the spatial domain significance and the time domain significance to obtain a significance score;
ranking the regions to be processed in the same frame according to the significance scores, and selecting a plurality of top-ranked regions to be processed as regions of interest;
and generating a binary mask according to the region of interest.
2. The bullet screen display method of claim 1, wherein: the calculating the time domain significance and the spatial significance of the video image according to the image characteristics of each frame in the video comprises the following steps:
calculating the spatial domain significance of each frame of the video according to the brightness, the chrominance and the texture characteristics;
selecting two frames separated by a preset frame from a video to be processed, and extracting optical flow characteristics of a foreground object of a next frame relative to a foreground object of a previous frame in the two frames; and calculating the unit modulus value of the optical flow according to the optical flow feature, and taking the unit modulus value of the optical flow as the time domain saliency.
3. The bullet screen display method of claim 2, wherein: the calculating the unit module value of the optical flow as the time domain saliency according to the optical flow characteristics of the foreground object of the next frame relative to the foreground object of the previous frame in the two frames separated by the preset frame comprises:
dividing each frame into a predetermined number of regions to be processed according to the scene type, calculating the distance between the regions to be processed of the same scene type in the two frames, and dividing the two frames into a set of frames of the same scene when the distance is within a predetermined threshold;
selecting two frames separated by a preset number of frames in the frame set of the same scene, extracting the optical flow characteristics of the foreground object of the next frame relative to the foreground object of the previous frame, and calculating the unit module value of the optical flow as the time domain saliency.
4. The bullet screen display method of claim 3, wherein: the selecting two frames of the frame set of the same scene, which are separated by a predetermined frame, and the extracting of the optical flow characteristics of the foreground object of the next frame relative to the foreground object of the previous frame comprises:
and sending the images of the previous frame and the next frame into an optical flow extraction network to obtain the optical flow characteristics of the foreground object of the next frame relative to the foreground object of the previous frame.
5. The bullet screen display method of claim 2, wherein: the calculating the spatial domain saliency of each frame of the video according to the brightness, the chrominance and the texture characteristics comprises the following steps:
dividing a video image into a plurality of pixel blocks;
calculating the average value of the brightness of the current pixel block as a first average brightness, and calculating the average value of the brightness of the pixel blocks around the current pixel block as a second average brightness;
calculating the average value of the difference values of the first average brightness and the second average brightness as the brightness significance of the current pixel block;
calculating the average value of the chroma of the current pixel block as a first average chroma, and calculating the average value of the chroma of the pixel blocks around the current pixel block as a second average chroma;
calculating the average value of the difference values of the first average chroma and the second average chroma as the chroma significance of the current pixel block;
calculating the average value of the texture features of the current pixel block as a first average texture feature, and calculating the average value of the texture features of the pixel blocks around the current pixel block as a second average texture feature;
calculating the average value of the difference values of the first average textural feature and the second average textural feature as the significance of the textural feature of the current pixel block;
and performing weighted calculation on the brightness significance, the chroma significance and the texture feature significance to obtain the spatial domain significance of the current pixel block.
6. A bullet screen display device is characterized in that: the method comprises the following steps:
the video preprocessing module is used for acquiring a video to be processed and dividing each frame of image into a foreground block and a background block according to target detection and scene segmentation; calculating the time domain significance of a background block and the space domain significance of a foreground block of the video image according to the image characteristics of each frame in the video;
the bullet screen processing module is used for generating a binary mask according to the time domain significance of the background block and the space domain significance of the foreground block, and the binary mask comprises an unobstructed area and an obstructed area;
the bullet screen display module is used for judging, according to preset position information of the bullet screen, whether the bullet screen is in the non-shielding area or the shielding area of the corresponding binary mask, hiding the bullet screen in the non-shielding area and normally displaying the bullet screen in the shielding area;
wherein, the generating a binary mask according to the time domain significance and the space domain significance comprises:
weighting the spatial domain significance and the time domain significance to obtain a significance score;
ranking the regions to be processed in the same frame according to the significance scores, and selecting a plurality of top-ranked regions to be processed as regions of interest;
and generating a binary mask according to the region of interest.
7. The bullet screen display device of claim 6 wherein: the video preprocessing module comprises:
the spatial domain significance submodule is used for calculating the spatial domain significance of each frame of the video according to the brightness, the chroma and the texture characteristics;
the time domain saliency submodule selects two frames separated by a preset frame from a video to be processed and extracts the optical flow characteristics of a foreground object of a next frame relative to a foreground object of a previous frame in the two frames; and calculating a unit modulus value of the optical flow according to the optical flow characteristic, and taking the unit modulus value of the optical flow as a time domain saliency.
8. The bullet screen display device of claim 7, wherein: the temporal saliency sub-module comprises:
the scene segmentation unit is used for segmenting each frame into a preset number of regions to be processed according to the scene type, calculating the distance between the regions to be processed of the same scene type in the two frames, and dividing the two frames into a set of frames of the same scene when the distance is within a preset threshold value;
and the saliency calculation unit is used for selecting two frames which are separated by a preset frame in the frame set of the same scene, extracting the optical flow characteristics of the foreground object of the next frame relative to the foreground object of the previous frame, and calculating the unit modulus value of the optical flow as the time domain saliency.
CN202010540299.5A 2020-06-12 2020-06-12 Bullet screen display method and device Active CN111654747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010540299.5A CN111654747B (en) 2020-06-12 2020-06-12 Bullet screen display method and device


Publications (2)

Publication Number Publication Date
CN111654747A CN111654747A (en) 2020-09-11
CN111654747B true CN111654747B (en) 2022-07-26

Family

ID=72350513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010540299.5A Active CN111654747B (en) 2020-06-12 2020-06-12 Bullet screen display method and device

Country Status (1)

Country Link
CN (1) CN111654747B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055B (en) 2020-10-15 2021-09-10 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device
CN114005076B (en) * 2021-12-30 2022-04-26 北京搜狐新动力信息技术有限公司 Image anti-blocking method and system and electronic equipment

Citations (5)

Publication number Priority date Publication date Assignee Title
WO2007095224A2 (en) * 2006-02-10 2007-08-23 Metacarta, Inc. Systems and methods for spatial thumbnails and companion maps for media objects
CN106101756A (en) * 2016-06-07 2016-11-09 腾讯科技(北京)有限公司 Barrage display packing, barrage adding method, Apparatus and system
CN106503683A (en) * 2016-10-28 2017-03-15 武汉大学 A kind of video well-marked target detection method based on dynamic focal point
CN107809658A (en) * 2017-10-18 2018-03-16 维沃移动通信有限公司 A kind of barrage content display method and terminal
CN110248209A (en) * 2019-07-19 2019-09-17 湖南快乐阳光互动娱乐传媒有限公司 A kind of transmission method and system of the anti-block masks information of barrage

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN103686178A (en) * 2013-12-04 2014-03-26 北京邮电大学 Method for extracting area-of-interest of video based on HVS
CN106210902B (en) * 2016-07-06 2019-06-11 华东师范大学 A kind of cameo shot clipping method based on barrage comment data


Non-Patent Citations (1)

Title
Saliency Detection Simulation of Surveillance Video for Criminal Investigation; Xiao Jun et al.; 《计算机仿真》 (Computer Simulation); July 2018, No. 07; pp. 1-6 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant