WO2020062898A1 - Video foreground target extraction method and apparatus - Google Patents

Video foreground target extraction method and apparatus Download PDF

Info

Publication number
WO2020062898A1
WO2020062898A1 (PCT/CN2019/088278, published as WO 2020/062898 A1)
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
transparency
foreground
pixels
Prior art date
Application number
PCT/CN2019/088278
Other languages
French (fr)
Chinese (zh)
Inventor
蔡昭权
蔡映雪
陈伽
胡松
黄思博
李慧
胡辉
陈明阳
Original Assignee
惠州学院
Priority date
Filing date
Publication date
Application filed by 惠州学院 filed Critical 惠州学院
Publication of WO2020062898A1 publication Critical patent/WO2020062898A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/10: Terrestrial scenes
    • G06T7/194: Segmentation; Edge detection involving foreground-background segmentation
    • G06T7/20: Analysis of motion
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V40/174: Facial expression recognition
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06T2207/10016: Video; Image sequence
    • G06T2207/20036: Morphological image processing
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30196: Human being; Person
    • G06T2207/30232: Surveillance
    • G06T2207/30241: Trajectory

Definitions

  • the present disclosure belongs to the field of image processing, and particularly relates to a method and a device for extracting video foreground objects.
  • a transparency mask is generated by selecting a color range, and then the foreground target of the video is extracted by using the transparency mask.
  • the present disclosure provides a video foreground target extraction method, including the following steps:
  • where I_k is the RGB color value of the unknown pixel Z_k;
  • the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k;
  • the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, so the foreground-background pixel pairs (F_i, B_j) total m² groups;
  • σ takes the value 0.1, and the foreground-background pixel pair with the highest credibility MAX(n_ij) is selected as (F_iMAX, B_jMAX);
  • S600 Superimpose grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all background pixel sets, and all unknown pixel sets;
  • for the second image, perform steps S200 to S500 to determine a first transparency mask of the second image, and use the first transparency mask of the second image as a second transparency mask of the first image;
  • a first division module, configured to: for a first image in a video, divide all foreground pixel sets F, all background pixel sets B, and all unknown pixel sets Z in the image, where the first image is a frame extracted from the video;
  • a first metric module configured to: given certain foreground and background pixel pairs (F i , B j ), measure the transparency of each unknown pixel Z k according to the following formula
  • where I_k is the RGB color value of the unknown pixel Z_k;
  • the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k;
  • the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, so the foreground-background pixel pairs (F_i, B_j) total m² groups;
  • a second metric module, configured to: for each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding transparency estimate, measure the credibility n_ij of the pair (F_i, B_j) according to the following formula:
  • σ takes the value 0.1, and the foreground-background pixel pair with the highest credibility MAX(n_ij) is selected as (F_iMAX, B_jMAX);
  • a calculation module for calculating an estimated transparency value of each unknown pixel Z k according to the following formula
  • a determination module, configured to: initially determine a first transparency mask of the first image according to the transparency estimate of each unknown pixel Z_k;
  • a second division module configured to superpose the grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all background pixel sets, and all unknown pixel sets;
  • a recalling module, configured to: for the second image, call the first metric module, the second metric module, the calculation module, and the determination module again to determine the first transparency mask of the second image, and use the first transparency mask of the second image as the second transparency mask of the first image;
  • a correction module configured to: use the second transparency mask of the first image to modify the first transparency mask of the first image
  • An extraction module is configured to extract a foreground object in the first image of the video according to a first transparency mask of the first image obtained by the correction module.
  • FIG. 1 is a schematic diagram of a method according to an embodiment of the present disclosure
  • FIG. 1 is a schematic flowchart of a video foreground target extraction method according to an embodiment of the present disclosure. As shown, the method includes the following steps:
  • when the video foreground target is extracted, the first image may be obtained as follows: while the video is playing, in response to a user operation, pause playback and immediately capture the current frame of the paused picture as the first image; alternatively, while the video is not playing, in response to a user operation, randomly select one or several frames of the video and use one of them as the first image.
  • this method can be used for foreground target extraction of each frame of image in the video.
  • the first image is a first frame image in a video.
  • where I_k is the RGB color value of the unknown pixel Z_k;
  • the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k;
  • the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, so the foreground-background pixel pairs (F_i, B_j) total m² groups;
  • the selection of m may make the corresponding foreground-background pixel pairs a partial sample or exhaust the entire image; step S200 is intended to estimate the transparency of the unknown pixels through the color relationship between the unknown pixels and the foreground-background pixel pairs;
  • the selection of m can further combine the characteristics of neighbor pixels and unknown pixels in terms of color, texture, grayscale, brightness, and spatial distance;
  • σ takes the value 0.1, and the foreground-background pixel pair with the highest credibility MAX(n_ij) is selected as (F_iMAX, B_jMAX);
  • step S300 uses the credibility to further filter the foreground-background pixel pairs, and the filtered pairs are used in the subsequent steps to estimate the transparency of the unknown pixels;
  • this embodiment thereby naturally determines the first transparency mask of the first image; it is natural in the sense that the transparency mask can be viewed as the set of pixels selected according to a certain transparency value (or value range);
  • S600 Superimpose grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all background pixel sets, and all unknown pixel sets;
  • after step S900, the method further includes the following steps:
  • S11001. Binarize the corrected first transparency mask of the previous frame's first image with a threshold of 0.5 to obtain a first binary image of the foreground target;
  • the pixels corresponding to true in the second binary image are used as the set of all foreground pixels F_c;
  • the pixels corresponding to false in the third binary image are used as the set of all background pixels B_C;
  • the remaining pixels are used as the set of all unknown pixels Z_C;
  • the corrected first transparency mask of the previous frame's first image thus divides all foreground pixel sets F_c, all background pixel sets B_C, and all unknown pixel sets Z_C in the first image corresponding to the current frame, striking a balance between accuracy and efficiency in image processing; that is, this embodiment is inheritive: it inherits the transparency mask of the previous frame and uses it to divide the foreground, background, and unknown pixel sets of the next frame. Given the continuity and similarity of the picture content, this division relies not only on the transparency mask of the previous frame but also on morphological erosion and morphological dilation, which is an innovation of the present disclosure.
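The inheritance scheme just described (threshold at 0.5, five erosions with a 3x3 circular element to get the sure-foreground set, dilation to get the sure-background complement) can be sketched as follows. Two assumptions are made: the number of dilation passes is not stated in this excerpt and is taken equal to the erosion count, and a cross-shaped structuring element stands in for the "circular" 3x3 element.

```python
import numpy as np

def _shift_stack(mask, pad_value):
    """Stack the 4-neighbourhood plus centre of a boolean mask."""
    p = np.pad(mask, 1, constant_values=pad_value)
    return np.stack([p[1:-1, 1:-1],               # centre
                     p[:-2, 1:-1], p[2:, 1:-1],   # up, down
                     p[1:-1, :-2], p[1:-1, 2:]])  # left, right

def erode(mask):
    # 3x3 "circular" (here: cross-shaped) structuring element
    return _shift_stack(mask, True).all(axis=0)

def dilate(mask):
    return _shift_stack(mask, False).any(axis=0)

def inherit_trimap(prev_alpha_mask, threshold=0.5, n_erode=5, n_dilate=5):
    """Derive the current frame's F_c / B_C / Z_C split from the previous
    frame's corrected transparency mask (steps S11001 onward).

    The dilation count is an assumption (chosen symmetric with the
    five stated erosions)."""
    binary = prev_alpha_mask >= threshold   # first binary image
    fg = binary.copy()
    for _ in range(n_erode):                # shrink -> sure foreground F_c
        fg = erode(fg)
    grown = binary.copy()
    for _ in range(n_dilate):               # grow -> outside is sure background
        grown = dilate(grown)
    background = ~grown                     # B_C
    unknown = ~fg & ~background             # Z_C
    return fg, background, unknown
```

On a 40x40 frame whose previous mask is a 20x20 foreground block, the eroded sure-foreground shrinks to a 10x10 core while the far corners remain sure background, with an unknown band in between.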
  • in step S600, the grayscale information is superimposed on the first image to generate the second image in the following manner:
  • the first image and the third image generate a second image by using the following formula:
  • IM 2 represents the gray value of the k-th pixel on the second image after superimposition
  • x r represents a neighborhood pixel of the k-th pixel x k on the first image
  • N_k represents the number of pixels in the neighborhood centered on x_k;
  • the weighting coefficient is taken as 0.5.
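The superposition formula itself appears only as an image in the source. Given the stated ingredients (a mean-filtered third image, a neighborhood of N_k pixels x_r around x_k, and a coefficient of 0.5), one plausible reading, sketched below as an assumption rather than the patent's literal formula, is an equal-weight blend of each gray value with its local mean:

```python
import numpy as np

def mean_filter3(img):
    """3x3 average filter with edge replication (the 'third image')."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    h, w = np.asarray(img).shape
    # sum of the 9 shifted views = per-pixel neighborhood sum
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def superimpose_gray(first_gray, weight=0.5):
    """Second image IM2: blend each gray value with its neighborhood mean.

    Only the stated ingredients are used (neighborhood pixels x_r around
    x_k, their count N_k, coefficient 0.5); the blend form is assumed."""
    first_gray = np.asarray(first_gray, dtype=float)
    third = mean_filter3(first_gray)        # mean-filtered third image
    return weight * first_gray + (1.0 - weight) * third
```

A constant image passes through unchanged, while an isolated bright pixel is pulled halfway toward its neighborhood mean.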
  • step S800 further includes:
  • the present disclosure also discloses a video foreground target extraction device in another embodiment, including:
  • a first division module, configured to: for a first image in a video, divide all foreground pixel sets F, all background pixel sets B, and all unknown pixel sets Z in the image, where the first image is a frame extracted from the video;
  • a first metric module configured to: given certain foreground and background pixel pairs (F i , B j ), measure the transparency of each unknown pixel Z k according to the following formula
  • where I_k is the RGB color value of the unknown pixel Z_k;
  • the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k;
  • the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, so the foreground-background pixel pairs (F_i, B_j) total m² groups;
  • a second metric module, configured to: for each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding transparency estimate, measure the credibility n_ij of the pair (F_i, B_j) according to the following formula:
  • a second division module configured to superpose the grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all background pixel sets, and all unknown pixel sets;
  • a recalling module, configured to: for the second image, call the first metric module, the second metric module, the calculation module, and the determination module again to determine the first transparency mask of the second image, and use the first transparency mask of the second image as the second transparency mask of the first image;
  • a correction module configured to: use the second transparency mask of the first image to modify the first transparency mask of the first image
  • An extraction module is configured to extract a foreground object in the first image of the video according to a first transparency mask of the first image obtained by the correction module.
  • each module may be combined with a processor and a memory to form a system for implementation; this, however, does not limit FIG. 2: each module may also have its own processing unit to provide data processing capability.
  • the inheritance calling module is configured to extract each remaining frame image from the video, use it as the first image, and input it to a third division module, where the third division module divides all foreground pixel sets F_c, all background pixel sets B_C, and all unknown pixel sets Z_C in the first image corresponding to the current frame according to the corrected first transparency mask of the previous frame's first image; then,
  • the inheritance calling module sequentially calls the first metric module, the second metric module, the calculation module, the determination module, the second division module, the recalling module, the correction module, and the extraction module, in order to extract all foreground targets of the video, where
  • the third division module includes:
  • a second binary image initial unit configured to: use the first binary image as an initial value of the second binary image
  • a second binary image processing unit configured to: perform a morphological erosion operation on the second binary image by using a circular structural element with a size of 3x3, and update the second binary image with the obtained result;
  • a first repeated calling unit, configured to repeatedly call the second binary image processing unit five times;
  • a third binary image processing unit, configured to: perform a morphological dilation operation on the third binary image using a circular structuring element of size 3x3, and update the third binary image with the obtained result;
  • a true/false division unit, configured to: use the pixels corresponding to true in the second binary image last updated by the second binary image processing unit as the set of all foreground pixels F_c; use the pixels corresponding to false in the third binary image as the set of all background pixels B_C; and use the remaining pixels as the set of all unknown pixels Z_C.
  • An average filtering unit configured to: perform average filtering on the first image to obtain a third image
  • a second image generating unit is configured to generate the second image by using the following formula:
  • IM 2 represents the gray value of the k-th pixel on the second image after superimposition
  • x r represents a neighborhood pixel of the k-th pixel x k on the first image
  • N_k represents the number of pixels in the neighborhood centered on x_k;
  • the weighting coefficient is taken as 0.5.
  • the correction module further includes:
  • an edge-finding unit, configured to: find the edge of the second transparency mask and the edge of the first transparency mask, respectively, according to the second transparency mask of the first image and the first transparency mask of the first image;
  • a position determining unit, configured to: obtain the positions of all pixels at the edge of the second transparency mask and the positions of all pixels at the edge of the first transparency mask, determine the area where these two sets of positions coincide, and then determine the pixels Z_sp having the same position;
  • a first correction unit, configured to: find, respectively, the transparency estimate of the pixel Z_sp in the first transparency mask of the first image and its transparency estimate in the second transparency mask of the first image, and use their average as the corrected transparency estimate of the pixel Z_sp;
  • a second correction unit, configured to: correct the first transparency mask of the first image by using the corrected transparency estimate of the pixel Z_sp.
  • the location determining unit further includes:
  • a different-position subunit, configured to further determine the pixels Z_dp whose positions differ, based on the positions where the pixels at the edge of the second transparency mask and the pixels at the edge of the first transparency mask do not coincide, including: pixels Z_dp2 located at the edge of the second transparency mask and pixels Z_dp1 located at the edge of the first transparency mask;
  • a complex correction subunit, configured to: correct the first transparency mask of the first image by combining the corrected transparency estimates of the pixels Z_dp1 and the pixels Z_dp2.
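The first correction unit's averaging rule at coincident edge pixels Z_sp can be sketched as below. The edge detector (thresholding the mask at 0.5 and keeping foreground pixels that border the background) is an assumption, and the complex correction for Z_dp1 / Z_dp2 is deliberately not implemented:

```python
import numpy as np

def mask_edges(mask, level=0.5):
    """Foreground-side edge pixels of a transparency mask.

    Assumed edge definition: threshold at `level`, mark both sides of
    every horizontal/vertical transition, keep the foreground side."""
    b = np.asarray(mask, dtype=float) >= level
    edge = np.zeros_like(b)
    dv = b[:-1, :] ^ b[1:, :]       # vertical transitions
    edge[:-1, :] |= dv
    edge[1:, :] |= dv
    dh = b[:, :-1] ^ b[:, 1:]       # horizontal transitions
    edge[:, :-1] |= dh
    edge[:, 1:] |= dh
    return edge & b

def correct_first_mask(first_mask, second_mask):
    """First correction unit: at coincident edge pixels Z_sp, replace
    the transparency estimate with the average of the two masks'
    estimates; non-coincident pixels Z_dp1 / Z_dp2 are left untouched."""
    first_mask = np.asarray(first_mask, dtype=float)
    second_mask = np.asarray(second_mask, dtype=float)
    z_sp = mask_edges(first_mask) & mask_edges(second_mask)
    out = first_mask.copy()
    out[z_sp] = 0.5 * (first_mask[z_sp] + second_mask[z_sp])
    return out
```

For two masks sharing the same vertical edge with values 1.0 and 0.8, the corrected edge pixels become 0.9 while interior pixels are untouched.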
  • each functional unit may be integrated into one processing unit, or each unit may exist alone, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit. When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a smart phone, a personal digital assistant, a wearable device, a notebook computer, or a tablet computer) to perform all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • The aforementioned storage media include media that can store program code, such as USB flash drives, Read-Only Memory (ROM), Random Access Memory (RAM), removable hard disks, magnetic disks, and optical disks.

Abstract

Disclosed are a video foreground target extraction method and apparatus. In the method, a transparency estimate is first recalculated by measuring the credibility of foreground-background pixel pairs to obtain a first transparency mask of a first image; then a new picture is generated by superimposing grayscale information, a second transparency mask of the first image is obtained, and the first transparency mask of the first image is further corrected; finally, the foreground target of a certain frame of the video is extracted by using the corrected first transparency mask. The present disclosure can comprehensively use the credibility of foreground-background pixel pairs and the grayscale information of a certain frame of a video, providing a new video foreground target extraction solution.

Description

Video foreground target extraction method and device
Technical Field
The present disclosure belongs to the field of image processing, and particularly relates to a method and a device for extracting video foreground targets.
Background Art
In the prior art in the image field, for a certain frame of a video, a transparency mask is generated by selecting a color range, and the foreground target of the video is then extracted by using the transparency mask.
However, although the prior art contains plenty of schemes for extracting video foreground targets, there is as yet no novel implementation of extracting video foreground targets by using foreground-background pixel pairs and grayscale information.
Summary of the Invention
The present disclosure provides a video foreground target extraction method, including the following steps:
S100. For a first image in a video, divide all foreground pixel sets F, all background pixel sets B, and all unknown pixel sets Z in the image, where the first image is a frame extracted from the video;
S200. Given certain foreground-background pixel pairs (F_i, B_j), measure the transparency estimate of each unknown pixel Z_k according to the following formula (rendered as images PCTCN2019088278-appb-000001 and PCTCN2019088278-appb-000002 in the original publication),
where I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k, the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, and the foreground-background pixel pairs (F_i, B_j) total m² groups;
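The transparency formula of step S200 is published only as embedded images, so it cannot be quoted directly. The sketch below is an assumed reconstruction based on the standard estimate used by sampling-based matting, which projects the unknown color I_k onto the color line between a candidate pair (F_i, B_j); the function name and the clipping to [0, 1] are additions for illustration, not the patent's text:

```python
import numpy as np

def estimate_alpha(I_k, F_i, B_j):
    """Alpha of unknown pixel I_k for a candidate pair (F_i, B_j).

    Assumed reconstruction (the patent's formula is an image):
    a = (I_k - B_j) . (F_i - B_j) / ||F_i - B_j||^2, clipped to [0, 1].
    """
    I_k, F_i, B_j = (np.asarray(v, dtype=float) for v in (I_k, F_i, B_j))
    d = F_i - B_j
    denom = float(np.dot(d, d))
    if denom == 0.0:  # degenerate pair: F and B colors coincide
        return 0.0
    alpha = float(np.dot(I_k - B_j, d)) / denom
    return float(np.clip(alpha, 0.0, 1.0))
```

For example, a mid-gray pixel between a white foreground sample and a black background sample yields an estimate of roughly one half.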
S300. For each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding transparency estimate (image PCTCN2019088278-appb-000003 in the original publication), measure the credibility n_ij of the pair (F_i, B_j) according to the following formula (image PCTCN2019088278-appb-000004),
where σ takes the value 0.1, and the foreground-background pixel pair with the highest credibility MAX(n_ij) is selected as (F_iMAX, B_jMAX);
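The credibility formula of step S300 is likewise an image in the source; the only recoverable details are σ = 0.1 and the selection of the pair with the highest n_ij. The sketch below assumes one common form: credibility decays exponentially with the compositing residual of the pair, with colors on a [0, 1] scale. Both the residual form and the color scale are assumptions:

```python
import numpy as np

def pair_confidence(I_k, F_i, B_j, sigma=0.1):
    """Credibility n_ij of a foreground-background pair for pixel I_k.

    Assumed form: fit alpha for the pair, composite a*F + (1 - a)*B,
    and map the residual through exp(-r^2 / sigma^2). Colors in [0, 1].
    """
    I_k, F_i, B_j = (np.asarray(v, dtype=float) for v in (I_k, F_i, B_j))
    d = F_i - B_j
    denom = float(np.dot(d, d))
    a = 0.0 if denom == 0.0 else float(np.clip(np.dot(I_k - B_j, d) / denom, 0.0, 1.0))
    residual = float(np.linalg.norm(I_k - (a * F_i + (1.0 - a) * B_j)))
    return float(np.exp(-residual ** 2 / sigma ** 2))

def best_pair(I_k, fg_samples, bg_samples):
    """Score all m*m pairs and keep the most credible one, (F_iMAX, B_jMAX)."""
    pairs = [(F, B) for F in fg_samples for B in bg_samples]
    return max(pairs, key=lambda p: pair_confidence(I_k, *p))
```

A pair that explains the unknown color exactly gets credibility 1.0; with σ = 0.1, even modest residuals drive the credibility toward zero, which is what makes the MAX(n_ij) selection sharp.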
S400. Calculate the transparency estimate of each unknown pixel Z_k according to the following formula (images PCTCN2019088278-appb-000005 and PCTCN2019088278-appb-000006 in the original publication);
S500. According to the transparency estimate of each unknown pixel Z_k (image PCTCN2019088278-appb-000007 in the original publication), initially determine a first transparency mask of the first image;
S600. Superimpose grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all its background pixel sets, and all its unknown pixel sets;
S700. For the second image, perform steps S200 to S500 to determine a first transparency mask of the second image, and use that mask as a second transparency mask of the first image;
S800. Use the second transparency mask of the first image to correct the first transparency mask of the first image;
S900. Extract the foreground target in the first image of the video according to the first transparency mask of the first image corrected in step S800.
In addition, the present disclosure also discloses a video foreground target extraction device, including:
a first division module, configured to: for a first image in a video, divide all foreground pixel sets F, all background pixel sets B, and all unknown pixel sets Z in the image, where the first image is a frame extracted from the video;
a first metric module, configured to: given certain foreground-background pixel pairs (F_i, B_j), measure the transparency estimate of each unknown pixel Z_k according to the following formula (images PCTCN2019088278-appb-000008 and PCTCN2019088278-appb-000009 in the original publication), where I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to Z_k, the background pixels B_j are likewise the m background pixels closest to Z_k, and the foreground-background pixel pairs (F_i, B_j) total m² groups;
a second metric module, configured to: for each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding transparency estimate (image PCTCN2019088278-appb-000010 in the original publication), measure the credibility n_ij of the pair according to the following formula (image PCTCN2019088278-appb-000011), where σ takes the value 0.1, and the pair with the highest credibility MAX(n_ij) is selected as (F_iMAX, B_jMAX);
a calculation module, configured to: calculate the transparency estimate of each unknown pixel Z_k according to the following formula (images PCTCN2019088278-appb-000012 and PCTCN2019088278-appb-000013 in the original publication);
a determination module, configured to: initially determine a first transparency mask of the first image according to the transparency estimate of each unknown pixel Z_k (image PCTCN2019088278-appb-000014 in the original publication);
a second division module, configured to: superimpose grayscale information on the first image to generate a second image, and divide the second image into all its foreground pixel sets, all its background pixel sets, and all its unknown pixel sets;
a recalling module, configured to: for the second image, call the first metric module, the second metric module, the calculation module, and the determination module again to determine a first transparency mask of the second image, and use that mask as a second transparency mask of the first image;
a correction module, configured to: use the second transparency mask of the first image to correct the first transparency mask of the first image;
an extraction module, configured to: extract the foreground target in the first image of the video according to the first transparency mask of the first image obtained by the correction module.
Through the method and device, the present disclosure can comprehensively use the credibility of foreground-background pixel pairs and grayscale information, providing a new video foreground target extraction scheme.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the method according to one embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the device according to another embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to understand the technical solutions disclosed herein, the technical solutions of the embodiments are described below with reference to the embodiments and the accompanying drawings; the described embodiments are only some, not all, of the embodiments of the present disclosure. The terms "first", "second", and the like used in this disclosure are for distinguishing different objects, not for describing a specific order. In addition, "including" and "having", and any variations thereof, are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, system, product, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present disclosure. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to independent or alternative embodiments mutually exclusive with other embodiments; those skilled in the art will understand that the embodiments described herein may be combined with other embodiments.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a video foreground target extraction method according to an embodiment of the present disclosure. As shown in the figure, the method includes the following steps:
S100: for a first image in a video, divide the image into the set F of all foreground pixels, the set B of all background pixels, and the set Z of all unknown pixels, where the first image is a frame extracted from the video;
It can be understood that there are many ways to divide an image into foreground pixels, background pixels, and unknown pixels: they may be labeled manually, obtained through machine learning or data-driven methods, or separated according to corresponding foreground and background thresholds; once the foreground and background pixels and their sets have been divided, the unknown pixels and their set follow naturally;
In addition, when the video foreground target is extracted, the first image may be obtained as follows: while the video is playing, in response to a user operation, playback is paused and the current frame of the paused picture is captured immediately to obtain the first image; alternatively, when the video is not playing, in response to a user operation, one or several frames of the video are selected at random, and one of them is taken as the first image. In any case, it can be understood that the method can be applied to foreground target extraction for every frame of the video. Preferably, the first image is the first frame of the video.
S200: given certain foreground-background pixel pairs (F_i, B_j), measure the transparency
Figure PCTCN2019088278-appb-000015
of each unknown pixel Z_k according to the following formula:
Figure PCTCN2019088278-appb-000016
where I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k, the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, and the foreground-background pixel pairs (F_i, B_j) total m² groups;
For those skilled in the art, in theory, m may be chosen so that the corresponding foreground-background pixel pairs form a partial sample, or so that they exhaust the entire image. Step S200 is intended to estimate the transparency of an unknown pixel from the color relationship between that pixel and the foreground-background pixel pairs. In addition, the choice of m may further take into account features such as color, texture, gray level, brightness, and spatial distance between the neighborhood pixels and the unknown pixel;
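For illustration only: the published per-pair formula is reproduced in this text only as an image reference, so the sketch below assumes the standard compositing model I = αF + (1−α)B, which gives the least-squares estimate α̂ = ((I_k − B_j)·(F_i − B_j)) / ‖F_i − B_j‖². All names and sample values are illustrative:

```python
import numpy as np

def alpha_estimate(I_k, F_i, B_j, eps=1e-8):
    """Per-pair transparency of unknown pixel Z_k under the compositing
    model I = alpha*F + (1-alpha)*B: project (I_k - B_j) onto (F_i - B_j)."""
    fb = F_i - B_j
    alpha = np.dot(I_k - B_j, fb) / (np.dot(fb, fb) + eps)
    return float(np.clip(alpha, 0.0, 1.0))

# m nearest foreground and m nearest background candidates give m*m pairs
F = np.array([[255.0, 255.0, 255.0], [250.0, 250.0, 250.0]])  # m = 2 foreground colors
B = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])           # m = 2 background colors
I = np.array([127.5, 127.5, 127.5])                           # unknown pixel's RGB value

alphas = [[alpha_estimate(I, f, b) for b in B] for f in F]    # m^2 = 4 estimates
```

A mid-gray pixel between pure black and pure white yields an estimate of 0.5, as the compositing model predicts.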
S300: for each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding
Figure PCTCN2019088278-appb-000017
measure the confidence n_ij of the pair (F_i, B_j) according to the following formula:
Figure PCTCN2019088278-appb-000018
where σ takes the value 0.1, and the pair with the highest confidence MAX(n_ij) is selected as (F_iMAX, B_jMAX);
It can be understood that the value of σ is an empirical, statistical, or simulated value. Step S300 uses the confidence to further screen the foreground-background pixel pairs, and the screened pairs are used in the subsequent steps to estimate the transparency of the unknown pixels;
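The confidence formula likewise appears in this text only as an image reference. A common form consistent with the description (σ = 0.1, higher confidence when the pair explains the pixel's color) is n_ij = exp(−‖I_k − (α̂F_i + (1−α̂)B_j)‖² / σ²) on colors normalized to [0, 1]; the sketch below assumes that form and selects (F_iMAX, B_jMAX):

```python
import numpy as np

def pair_confidence(I_k, F_i, B_j, alpha, sigma=0.1):
    """Assumed confidence n_ij: close to 1 when I_k is well explained as a
    mix of F_i and B_j with weight alpha; colors normalized to [0, 1]."""
    residual = I_k - (alpha * F_i + (1.0 - alpha) * B_j)
    return float(np.exp(-np.dot(residual, residual) / sigma ** 2))

def best_pair(I_k, fg_candidates, bg_candidates, sigma=0.1):
    """Step S300: evaluate all m^2 pairs and keep (F_iMAX, B_jMAX)."""
    best = None
    for F_i in fg_candidates:
        for B_j in bg_candidates:
            fb = F_i - B_j
            alpha = float(np.clip(np.dot(I_k - B_j, fb) / (np.dot(fb, fb) + 1e-8), 0.0, 1.0))
            n_ij = pair_confidence(I_k, F_i, B_j, alpha, sigma)
            if best is None or n_ij > best[0]:
                best = (n_ij, F_i, B_j, alpha)
    return best  # (confidence, F_iMAX, B_jMAX, alpha of the best pair)

I_k = np.array([0.5, 0.5, 0.5])
fgs = [np.array([1.0, 1.0, 1.0]), np.array([1.0, 0.0, 0.0])]
bgs = [np.array([0.0, 0.0, 0.0]), np.array([0.2, 0.0, 0.0])]
n, F_best, B_best, a = best_pair(I_k, fgs, bgs)
```

For a mid-gray pixel, the white/black pair reconstructs the color exactly, so it wins with confidence near 1.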
S400: calculate the transparency estimate
Figure PCTCN2019088278-appb-000019
of each unknown pixel Z_k according to the following formula:
Figure PCTCN2019088278-appb-000020
S500: initially determine a first transparency mask of the first image according to the transparency estimate
Figure PCTCN2019088278-appb-000021
of each unknown pixel Z_k;
That is to say, once the transparency estimate of each unknown pixel has been obtained, this embodiment naturally makes an initial determination of the first transparency mask of the first image; it is called natural because the transparency mask can be regarded as composed of those pixels whose estimates
Figure PCTCN2019088278-appb-000022
fall within a certain value or range of values;
S600: superimpose grayscale information on the first image to generate a second image, and divide the second image into its set of all foreground pixels, set of all background pixels, and set of all unknown pixels;
As far as this step is concerned, this embodiment takes the view that, in addition to the role of its RGB color, the influence of grayscale information on each pixel should be considered; therefore, after the grayscale information is superimposed, the transparency mask is corrected by the following steps.
S700: for the second image, perform steps S200 to S500 to determine a first transparency mask of the second image, and use the first transparency mask of the second image as a second transparency mask of the first image;
S800: correct the first transparency mask of the first image by using the second transparency mask of the first image;
S900: extract the foreground target in the first image of the video according to the first transparency mask of the first image as corrected in step S800.
So far, the present disclosure provides a new video foreground target extraction scheme that makes combined use of the confidence of foreground-background pixel pairs and grayscale information. It can be understood that extracting a video foreground target is a process of successive approximation: because of the color and grayscale transitions in a video frame, it is difficult to say that the transparency mask obtained by any one method is the only correct one. In theory, the above embodiment fuses more information and considers more factors, which allows the images in the video to be examined more comprehensively and a relatively satisfactory foreground target to be extracted. It can also be understood that, in the above embodiment, when the foreground target in the first image of the video is extracted according to the first transparency mask, relevant means in the prior art may be drawn upon and combined. In other words, the key of the above embodiment lies in how to obtain the transparency mask in a new way, not in how to extract the video foreground target from the transparency mask.
In another embodiment, the following steps are further included after step S900:
S1000: extract each remaining frame from the video, take each in turn as the first image, and repeat the foregoing steps S100 to S900 to extract all foreground targets of the video; or
S1100: extract each remaining frame from the video, take each in turn as the first image, and divide the set F_c of all foreground pixels, the set B_c of all background pixels, and the set Z_c of all unknown pixels in the first image corresponding to the current frame according to the corrected first transparency mask of the previous frame's first image, then repeat the foregoing steps S200 to S900 to extract all foreground targets of the video, where dividing the sets F_c, B_c, and Z_c in the first image corresponding to the current frame specifically includes the following steps:
S11001: binarize the corrected first transparency mask of the previous frame's first image with a threshold of 0.5 to obtain a first binary image of the foreground target;
S11002: use the first binary image as the initial value of a second binary image;
S11003: perform a morphological erosion operation on the second binary image using a circular structuring element of size 3x3, and update the second binary image with the result;
S11004: repeat step S11003 five times;
S11005: use the first binary image as the initial value of a third binary image;
S11006: perform a morphological dilation operation on the third binary image using a circular structuring element of size 3x3, and update the third binary image with the result;
S11007: repeat step S11006 five times;
S11008: take the pixels that are true in the second binary image as the set F_c of all foreground pixels, take the pixels that are false in the third binary image as the set B_c of all background pixels, and take the remaining pixels as the set Z_c of all unknown pixels.
It can be understood that repeating the above steps S100 to S900 for every frame in the video will extract all foreground targets in the video. However, considering that each frame of a video and the frame that follows it usually exhibit continuity and similarity in picture content, and in order to make full use of that continuity and similarity, the above embodiment may also divide the set F_c of all foreground pixels, the set B_c of all background pixels, and the set Z_c of all unknown pixels in the first image corresponding to the current frame according to the corrected first transparency mask of the previous frame's first image, thereby striking a balance between accuracy and efficiency in image processing. In other words, this embodiment has an inheritance property: it inherits the transparency mask of the previous frame and uses it to divide the foreground, background, and unknown pixel sets of the next frame. Given the continuity and similarity of picture content, this division not only relies on the previous frame's transparency mask but also employs morphological erosion and morphological dilation, which is one of the innovations of the present disclosure.
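Steps S11001 to S11008 are fully specified (threshold 0.5, 3x3 circular structuring element, five erosions and five dilations) and can be sketched as follows; the naive morphology helpers are illustrative stand-ins for any standard implementation:

```python
import numpy as np

CROSS = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=bool)  # 3x3 "circular" element (disk of radius 1)

def _morph(img, struct, op):
    """Naive binary erosion/dilation; pixels outside the image count as background."""
    H, W = img.shape
    out = np.zeros_like(img, dtype=bool)
    offs = np.argwhere(struct) - np.array(struct.shape) // 2
    for y in range(H):
        for x in range(W):
            vals = [bool(img[y + dy, x + dx]) if 0 <= y + dy < H and 0 <= x + dx < W else False
                    for dy, dx in offs]
            out[y, x] = all(vals) if op == "erode" else any(vals)
    return out

def trimap_from_mask(alpha_mask, iters=5):
    """S11001-S11008: previous frame's corrected mask -> current frame's trimap."""
    first = alpha_mask > 0.5                 # S11001: binarize at threshold 0.5
    second = first.copy()                    # S11002
    for _ in range(iters):                   # S11003/S11004: erode five times
        second = _morph(second, CROSS, "erode")
    third = first.copy()                     # S11005
    for _ in range(iters):                   # S11006/S11007: dilate five times
        third = _morph(third, CROSS, "dilate")
    foreground = second                      # S11008: true pixels -> F_c
    background = ~third                      # false pixels -> B_c
    unknown = ~foreground & ~background      # the rest -> Z_c
    return foreground, background, unknown

mask = np.zeros((21, 21))
mask[5:16, 5:16] = 1.0                       # an 11x11 foreground block
fg, bg, unk = trimap_from_mask(mask)
```

Five erosions of an 11x11 block with the radius-1 disk leave a single center pixel, while five dilations push the background boundary outward, so the unknown band sits between the two.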
In another embodiment, in step S600, the grayscale information is superimposed on the first image to generate the second image in the following manner:
S601: perform mean filtering on the first image to obtain a third image;
S602: generate the second image from the first image and the third image by the following formula:
Figure PCTCN2019088278-appb-000023
where IM_2 denotes the gray value of the k-th pixel on the second image after superposition, x_r denotes a neighborhood pixel of the k-th pixel x_k on the first image, N_k denotes the number of pixels in the neighborhood centered on x_k,
Figure PCTCN2019088278-appb-000024
denotes the pixel value of the k-th pixel on the third image obtained by mean filtering the first image, and β takes the value 0.5.
For the above embodiment, a specific way of superimposing grayscale information is given through empirical values and the relevant formulas.
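The S602 formula is given in this text only as an image reference; a sketch consistent with the surrounding description (a neighborhood mean over N_k pixels, blending weight β = 0.5) is IM_2(k) = β·g(k) + (1−β)·mean(g, k), where g is a gray version of the first image. The luminance conversion and the assumed blend form are illustrative:

```python
import numpy as np

def mean_filter(gray, radius=1):
    """Third image (S601): mean over the neighborhood centered on each pixel;
    N_k shrinks at the image borders."""
    H, W = gray.shape
    out = np.empty_like(gray, dtype=float)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            out[y, x] = gray[y0:y1, x0:x1].mean()
    return out

def superimpose_gray(image_rgb, beta=0.5, radius=1):
    """S602 (assumed form): blend each pixel's gray value with its
    mean-filtered value, IM_2(k) = beta*g_k + (1-beta)*mean_k."""
    gray = image_rgb.mean(axis=2)  # simple luminance; a weighted RGB sum also works
    return beta * gray + (1.0 - beta) * mean_filter(gray, radius)

out = superimpose_gray(np.full((4, 4, 3), 100.0))  # constant image stays constant
```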
In another embodiment, step S800 further includes:
S801: according to the second transparency mask of the first image and the first transparency mask of the first image, find the edge of the second transparency mask and the edge of the first transparency mask, respectively;
S802: obtain the positions of all pixels on the edge of the second transparency mask and the positions of all pixels on the edge of the first transparency mask, determine the region where the positions of the pixels on the edge of the second transparency mask coincide with the positions of the pixels on the edge of the first transparency mask, and thereby determine the pixels Z_sp whose positions are the same;
S803: look up the transparency estimate of the pixel Z_sp in the first transparency mask of the first image and its transparency estimate in the second transparency mask of the first image, and take the average of the two as the corrected transparency estimate of the pixel Z_sp;
S804: correct the first transparency mask of the first image with the corrected transparency estimate of the pixel Z_sp.
The above embodiment is intended to find and compare the pixels that occupy the same positions in the two transparency masks, and to use the transparency estimates of those pixels in the respective masks, averaged, to correct the first transparency mask of the first image.
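Steps S801 to S804 can be sketched as follows; the edge definition used here (a binarized pixel with a 4-neighbor of the opposite class) is an illustrative assumption, since the text does not fix a particular edge detector:

```python
import numpy as np

def mask_edges(alpha_mask, thresh=0.5):
    """Edge pixels of a transparency mask: binarized pixels with at least
    one 4-neighbor of the opposite class (an illustrative edge definition)."""
    binary = alpha_mask > thresh
    H, W = binary.shape
    edge = np.zeros_like(binary)
    for y in range(H):
        for x in range(W):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W and binary[yy, xx] != binary[y, x]:
                    edge[y, x] = True
                    break
    return edge

def correct_first_mask(mask1, mask2):
    """S801-S804: at positions Z_sp where the two mask edges coincide,
    replace mask1's value with the average of the two transparency estimates."""
    z_sp = mask_edges(mask1) & mask_edges(mask2)   # S802: coincident edge pixels
    corrected = mask1.copy()
    corrected[z_sp] = 0.5 * (mask1[z_sp] + mask2[z_sp])  # S803/S804: average
    return corrected

m1 = np.zeros((5, 5)); m1[1:4, 1:4] = 1.0   # first transparency mask
m2 = np.zeros((5, 5)); m2[1:4, 1:4] = 0.6   # second transparency mask
c = correct_first_mask(m1, m2)
```

Here both masks binarize to the same 3x3 square, so the coincident edge pixels get the averaged value (1.0 + 0.6) / 2 = 0.8 while interior pixels are left untouched.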
In another embodiment, step S802 further includes:
S8021: according to the determined region where the positions of all pixels on the edge of the second transparency mask coincide with the positions of all pixels on the edge of the first transparency mask, further determine the pixels Z_dp whose positions differ, which fall into two cases: pixels Z_dp2 on the edge of the second transparency mask and pixels Z_dp1 on the edge of the first transparency mask;
Unlike the previous embodiment, this embodiment additionally attends to the pixels that occupy different positions on the edges determined by the two transparency masks, and finds those mutually non-coincident pixels;
S8022: using the pixels Z_dp with differing positions and the pixels Z_sp with the same positions, obtain what the edge of the second transparency mask and the edge of the first transparency mask jointly determine: the closed region enclosed between the two edges, as well as the positions of all enclosed pixels of that closed region;
As far as this step is concerned, since the edge corresponding to each mask can to some extent be regarded as a connected or closed curve, then regardless of how the closed curves corresponding to the two masks overlap or fail to overlap, the pixels on the two edges whose positions do not correspond (that is, whose positions differ, or do not coincide) jointly determine the closed region enclosed between the edges of the two masks, as well as the positions of all enclosed pixels of that region;
S8023: perform the following sub-steps:
(1) look up the transparency estimate, in the first transparency mask of the first image, of the pixel corresponding to the position of the pixel Z_dp1, look up the transparency value of that corresponding pixel in the second image, and take the average of the two as the corrected transparency estimate of the pixel Z_dp1;
(2) look up the transparency estimate, in the second transparency mask of the first image, of the pixel corresponding to the position of the pixel Z_dp2, look up the transparency value of that corresponding pixel in the first image, and take the average of the two as the corrected transparency estimate of the pixel Z_dp2;
For this step, the intention is to find, for each pixel in the aforementioned closed region, its transparency estimate or transparency value under the two different systems, and to take the average of the two as the corrected transparency estimate of the corresponding pixel, which is then used in the next step S8024 to correct the first transparency mask of the first image. In other words, this embodiment follows the correction idea of the previous embodiment, except that it deals with the region jointly enclosed by the edges corresponding to the two masks. Taking the pixel Z_dp1 as an example: it belongs to the first transparency mask of the first image and has a transparency estimate there; in addition, the pixel in the second image corresponding to the position of Z_dp1 has a transparency value in the second image, and this embodiment takes the average of that transparency estimate and that transparency value as the corrected transparency estimate of the corresponding pixel Z_dp1. The pixel Z_dp2 is treated similarly.
S8024: correct the first transparency mask of the first image by combining the corrected transparency estimate of the pixel Z_dp1 and the corrected transparency estimate of the pixel Z_dp2. For example, the corrected transparency estimates of the pixels Z_dp1 and Z_dp2 are taken as the transparency values of the pixels at the corresponding positions of the first transparency mask.
The steps in the methods of the embodiments of the present disclosure may be reordered, combined, and pruned according to actual needs.
It should be noted that, for the sake of simple description, each of the foregoing method embodiments is expressed as a series of action combinations; however, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously.
In addition, referring to FIG. 2, in another embodiment the present disclosure further discloses a video foreground target extraction apparatus, including:
a first division module, configured to: for a first image in a video, divide the image into the set F of all foreground pixels, the set B of all background pixels, and the set Z of all unknown pixels, where the first image is a frame extracted from the video;
a first metric module, configured to: given certain foreground-background pixel pairs (F_i, B_j), measure the transparency
Figure PCTCN2019088278-appb-000025
of each unknown pixel Z_k according to the following formula:
Figure PCTCN2019088278-appb-000026
where I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k, the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, and the foreground-background pixel pairs (F_i, B_j) total m² groups;
a second metric module, configured to: for each of the m² foreground-background pixel pairs (F_i, B_j) and its corresponding
Figure PCTCN2019088278-appb-000027
measure the confidence n_ij of the pair (F_i, B_j) according to the following formula:
Figure PCTCN2019088278-appb-000028
where σ takes the value 0.1, and the pair with the highest confidence MAX(n_ij) is selected as (F_iMAX, B_jMAX);
a calculation module, configured to calculate the transparency estimate
Figure PCTCN2019088278-appb-000029
of each unknown pixel Z_k according to the following formula:
Figure PCTCN2019088278-appb-000030
a determination module, configured to initially determine a first transparency mask of the first image according to the transparency estimate
Figure PCTCN2019088278-appb-000031
of each unknown pixel Z_k;
a second division module, configured to superimpose grayscale information on the first image to generate a second image, and divide the second image into its set of all foreground pixels, set of all background pixels, and set of all unknown pixels;
a re-invocation module, configured to: for the second image, invoke the first metric module, the second metric module, the calculation module, and the determination module again to determine a first transparency mask of the second image, and use the first transparency mask of the second image as a second transparency mask of the first image;
a correction module, configured to correct the first transparency mask of the first image by using the second transparency mask of the first image;
an extraction module, configured to extract the foreground target in the first image of the video according to the first transparency mask of the first image obtained by the correction module.
As for this embodiment, as shown in FIG. 2, the above modules may form a system together with a processor and a memory for implementation; however, FIG. 2 does not preclude each module from having its own processing unit to provide data processing capability.
In another embodiment, the apparatus further includes the following modules:
a sequential invocation module, configured to: extract each remaining frame from the video, take each in turn as the first image, and sequentially invoke the first division module, the first metric module, the second metric module, the calculation module, the determination module, the second division module, the re-invocation module, the correction module, and the extraction module, to extract all foreground targets of the video; or the apparatus includes:
an inheritance invocation module, configured to: extract each remaining frame from the video, take each in turn as the first image, and input it to a third division module, where the third division module is configured to divide the set F_c of all foreground pixels, the set B_c of all background pixels, and the set Z_c of all unknown pixels in the first image corresponding to the current frame according to the corrected first transparency mask of the previous frame's first image; the inheritance invocation module then sequentially invokes the first metric module, the second metric module, the calculation module, the determination module, the second division module, the re-invocation module, the correction module, and the extraction module, to extract all foreground targets of the video, where the third division module includes:
a first binary image processing unit, configured to binarize the corrected first transparency mask of the previous frame's first image with a threshold of 0.5 to obtain a first binary image of the foreground target;
a second binary image initialization unit, configured to use the first binary image as the initial value of a second binary image;
a second binary image processing unit, configured to perform a morphological erosion operation on the second binary image using a circular structuring element of size 3x3 and update the second binary image with the result;
a first repeated invocation unit, configured to invoke the second binary image processing unit five times;
a third binary image initialization unit, configured to use the first binary image as the initial value of a third binary image;
a third binary image processing unit, configured to perform a morphological dilation operation on the third binary image using a circular structuring element of size 3x3 and update the third binary image with the result;
a second repeated invocation unit, configured to invoke the third binary image processing unit five times;
a true/false division unit, configured to: take the pixels that are true in the second binary image as last updated by the second binary image processing unit as the set F_c of all foreground pixels, take the pixels that are false in the third binary image as last updated by the third binary image processing unit as the set B_c of all background pixels, and take the remaining pixels as the set Z_c of all unknown pixels.
In another embodiment, the second division module further includes:
a mean filtering unit, configured to perform mean filtering on the first image to obtain a third image;
a second image generating unit, configured to generate the second image from the first image and the third image by the following formula:
Figure PCTCN2019088278-appb-000032
where IM_2 denotes the gray value of the k-th pixel on the second image after superposition, x_r denotes a neighborhood pixel of the k-th pixel x_k on the first image, N_k denotes the number of pixels in the neighborhood centered on x_k,
Figure PCTCN2019088278-appb-000033
denotes the pixel value of the k-th pixel on the third image obtained by mean filtering the first image, and β takes the value 0.5.
In another embodiment, the correction module further includes:
an edge finding unit, configured to find, according to the second transparency mask of the first image and the first transparency mask of the first image, the edge of the second transparency mask and the edge of the first transparency mask, respectively;
a position determination unit, configured to: obtain the positions of all pixels on the edge of the second transparency mask and the positions of all pixels on the edge of the first transparency mask, determine the region where the positions of the pixels on the edge of the second transparency mask coincide with the positions of the pixels on the edge of the first transparency mask, and thereby determine the pixels Z_sp whose positions are the same;
a first correction unit, configured to look up the transparency estimate of the pixel Z_sp in the first transparency mask of the first image and its transparency estimate in the second transparency mask of the first image, and take the average of the two as the corrected transparency estimate of the pixel Z_sp;
a second correction unit, configured to correct the first transparency mask of the first image with the corrected transparency estimate of the pixel Z_sp.
It can be understood that the apparatus is capable of implementing the method described in the first embodiment above.
In another embodiment, the position-determining unit further includes:
a different-position subunit, configured to: based on the determined region where the edge-pixel positions of the second transparency mask coincide with those of the first transparency mask, further identify the pixels Z_dp whose positions differ, including the pixels Z_dp2 located on the edge of the second transparency mask and the pixels Z_dp1 located on the edge of the first transparency mask;
a closing subunit, configured to: use the different-position pixels Z_dp and the same-position pixels Z_sp to obtain the closed region enclosed between the edge of the second transparency mask and the edge of the first transparency mask, together with the positions of all pixels enclosed by that region;
a multiple-lookup subunit, configured to:
(1) look up the transparency estimate, in the first transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp1, look up the transparency value of that corresponding pixel in the second image, and take the average of the two as the corrected transparency estimate of pixel Z_dp1;
(2) look up the transparency estimate, in the second transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp2, look up the transparency value of that corresponding pixel in the first transparency mask of the first image, and take the average of the two as the corrected transparency estimate of pixel Z_dp2;
a composite correction subunit, configured to: correct the first transparency mask of the first image by combining the corrected transparency estimates of the pixels Z_dp1 with the corrected transparency estimates of the pixels Z_dp2.
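The two-way averaging performed by the multiple-lookup and composite correction subunits can be sketched as one function. The array names are illustrative, not from the patent; in particular, `alpha_via_second_image` stands in for "the transparency value of the corresponding pixel in the second image", however it is obtained:

```python
import numpy as np

def correct_on_disjoint_edges(first_mask, second_mask, alpha_via_second_image,
                              z_dp1, z_dp2):
    # z_dp1 / z_dp2: boolean maps of edge pixels lying only on the first /
    # second mask's edge. For Z_dp1 the first mask's estimate is averaged
    # with the value found via the second image; for Z_dp2 the second
    # mask's estimate is averaged with the first mask's value.
    corrected = first_mask.copy()
    corrected[z_dp1] = (first_mask[z_dp1] + alpha_via_second_image[z_dp1]) / 2.0
    corrected[z_dp2] = (second_mask[z_dp2] + first_mask[z_dp2]) / 2.0
    return corrected
```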
Those skilled in the art should also appreciate that the embodiments described in this specification are all preferred embodiments, and that the actions, modules, and units involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis; for a part not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided by the present disclosure, it should be understood that the disclosed method may be implemented as corresponding functional units, processors, or even a system, where the parts of the system may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of the embodiment. In addition, the functional units may be integrated into one processing unit, each unit may exist separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a smartphone, a personal digital assistant, a wearable device, a notebook computer, or a tablet computer) to perform all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include various media that can store program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, and optical discs.
The above embodiments are intended only to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present disclosure.
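The pixel-pair transparency estimation at the core of the method (steps S200 to S400 of the embodiments and claims below) can be sketched as one loop over the m^2 candidate pairs. Because the formulas themselves are figure placeholders in this text, the projection-based alpha and the exponential confidence with σ = 0.1 used here are the standard shared-matting forms and are assumptions, not taken verbatim from the patent:

```python
import numpy as np

def estimate_alpha(I_k, fg_pixels, bg_pixels, sigma=0.1):
    # For each of the m^2 (F_i, B_j) pairs: project I_k onto the F-B line
    # to get a candidate alpha (cf. S200), score the pair by how well that
    # alpha reconstructs I_k (cf. S300), and keep the alpha of the most
    # credible pair (cf. S400).
    best_alpha, best_conf = 0.0, -1.0
    for F in fg_pixels:            # m nearest foreground candidates
        for B in bg_pixels:        # m nearest background candidates
            d = F - B
            denom = float(np.dot(d, d))
            if denom == 0.0:
                continue
            alpha = float(np.clip(np.dot(I_k - B, d) / denom, 0.0, 1.0))
            residual = I_k - (alpha * F + (1.0 - alpha) * B)
            conf = float(np.exp(-np.dot(residual, residual) / sigma ** 2))
            if conf > best_conf:
                best_alpha, best_conf = alpha, conf
    return best_alpha
```

A pixel whose RGB color lies exactly midway between a foreground and a background candidate receives alpha 0.5 under this projection, which matches the intuition behind the pair-confidence selection.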

Claims (10)

  1. A video foreground target extraction method, comprising the following steps:
    S100: for a first image in a video, divide the image into the set F of all foreground pixels, the set B of all background pixels, and the set Z of all unknown pixels, wherein the first image is a frame extracted from the video;
    S200: given certain foreground-background pixel pairs (F_i, B_j), measure the transparency
    Figure PCTCN2019088278-appb-100001
    of each unknown pixel Z_k according to the following formula:
    Figure PCTCN2019088278-appb-100002
    wherein I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k, the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, and the foreground-background pixel pairs (F_i, B_j) total m^2 groups;
    S300: for each foreground-background pixel pair (F_i, B_j) in the m^2 groups and its corresponding
    Figure PCTCN2019088278-appb-100003
    measure the confidence n_ij of the foreground-background pixel pair (F_i, B_j) according to the following formula:
    Figure PCTCN2019088278-appb-100004
    wherein σ takes the value 0.1, and the group of foreground-background pixel pairs corresponding to the highest-confidence MAX(n_ij) is selected as (F_iMAX, B_jMAX);
    S400: calculate the transparency estimate
    Figure PCTCN2019088278-appb-100005
    of each unknown pixel Z_k according to the following formula:
    Figure PCTCN2019088278-appb-100006
    S500: preliminarily determine the first transparency mask of the first image according to the transparency estimate
    Figure PCTCN2019088278-appb-100007
    of each unknown pixel Z_k;
    S600: superimpose grayscale information on the first image to generate a second image, and divide the second image into its set of all foreground pixels, set of all background pixels, and set of all unknown pixels;
    S700: for the second image, perform steps S200 to S500 to determine a first transparency mask of the second image, and use the first transparency mask of the second image as a second transparency mask of the first image;
    S800: correct the first transparency mask of the first image using the second transparency mask of the first image;
    S900: extract the foreground target in the first image of the video according to the first transparency mask of the first image corrected in step S800.
  2. The method according to claim 1, wherein the following steps are further included after step S900:
    S1000: extract each remaining frame from the video, take each in turn as the first image, and repeat the foregoing steps S100 to S900 to extract all foreground targets of the video; or
    S1100: extract each remaining frame from the video, take each in turn as the first image, divide the set F_c of all foreground pixels, the set B_C of all background pixels, and the set Z_C of all unknown pixels in the first image corresponding to the current frame according to the corrected first transparency mask of the first image of the previous frame, and repeat the foregoing steps S200 to S900 to extract all foreground targets of the video, wherein dividing the sets F_c, B_C, and Z_C in the first image corresponding to the current frame specifically comprises the following steps:
    S11001: binarize the corrected first transparency mask of the first image of the previous frame with a threshold of 0.5 to obtain a first binary image of the foreground target;
    S11002: use the first binary image as the initial value of a second binary image;
    S11003: perform a morphological erosion operation on the second binary image using a circular structuring element of size 3x3, and update the second binary image with the obtained result;
    S11004: repeat step S11003 five times;
    S11005: use the first binary image as the initial value of a third binary image;
    S11006: perform a morphological dilation operation on the third binary image using a circular structuring element of size 3x3, and update the third binary image with the obtained result;
    S11007: repeat step S11006 five times;
    S11008: take the corresponding pixels that are true in the second binary image as the set F_c of all foreground pixels, the corresponding pixels that are false in the third binary image as the set B_C of all background pixels, and the remaining pixels as the set Z_C of all unknown pixels.
  3. The method according to claim 1, wherein, in step S600, grayscale information is superimposed on the first image to generate the second image in the following manner:
    S601: perform mean filtering on the first image to obtain a third image;
    S602: generate the second image from the first image and the third image by the following formula:
    Figure PCTCN2019088278-appb-100008
    wherein IM_2 represents the gray value of the k-th pixel on the second image after superposition, x_r represents a neighborhood pixel of the k-th pixel x_k on the first image, N_k represents the number of pixels in the neighborhood centered on x_k,
    Figure PCTCN2019088278-appb-100009
    represents the pixel value of the k-th pixel on the third image obtained by mean-filtering the first image, and β takes the value 0.5.
  4. The method according to claim 1, wherein step S800 further comprises:
    S801: from the second transparency mask of the first image and the first transparency mask of the first image, find the edge of the second transparency mask and the edge of the first transparency mask, respectively;
    S802: obtain the positions of all pixels on the edge of the second transparency mask and the positions of all pixels on the edge of the first transparency mask, determine the region where the two sets of edge-pixel positions coincide, and thereby identify the pixels Z_sp whose positions are the same;
    S803: look up, for each pixel Z_sp, its transparency estimate in the first transparency mask of the first image and its transparency estimate in the second transparency mask of the first image, and take the average of the two as the corrected transparency estimate of the pixel Z_sp;
    S804: correct the first transparency mask of the first image with the corrected transparency estimates of the pixels Z_sp.
  5. The method according to claim 4, wherein step S802 further comprises:
    S8021: based on the determined region where the edge-pixel positions of the second transparency mask coincide with those of the first transparency mask, further identify the pixels Z_dp whose positions differ, including the pixels Z_dp2 located on the edge of the second transparency mask and the pixels Z_dp1 located on the edge of the first transparency mask;
    S8022: use the different-position pixels Z_dp and the same-position pixels Z_sp to obtain the closed region enclosed between the edge of the second transparency mask and the edge of the first transparency mask, together with the positions of all pixels enclosed by that region;
    S8023: perform the following sub-steps:
    (1) look up the transparency estimate, in the first transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp1, look up the transparency value of that corresponding pixel in the second image, and take the average of the two as the corrected transparency estimate of pixel Z_dp1;
    (2) look up the transparency estimate, in the second transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp2, look up the transparency value of that corresponding pixel in the first image, and take the average of the two as the corrected transparency estimate of pixel Z_dp2;
    S8024: combine the corrected transparency estimates of the pixels Z_dp1 with the corrected transparency estimates of the pixels Z_dp2 to correct the first transparency mask of the first image.
  6. A video foreground target extraction apparatus, comprising:
    a first dividing module, configured to: for a first image in a video, divide the image into the set F of all foreground pixels, the set B of all background pixels, and the set Z of all unknown pixels, wherein the first image is a frame extracted from the video;
    a first measurement module, configured to: given certain foreground-background pixel pairs (F_i, B_j), measure the transparency
    Figure PCTCN2019088278-appb-100010
    of each unknown pixel Z_k according to the following formula:
    Figure PCTCN2019088278-appb-100011
    wherein I_k is the RGB color value of the unknown pixel Z_k, the foreground pixels F_i are the m foreground pixels closest to the unknown pixel Z_k, the background pixels B_j are likewise the m background pixels closest to the unknown pixel Z_k, and the foreground-background pixel pairs (F_i, B_j) total m^2 groups;
    a second measurement module, configured to: for each foreground-background pixel pair (F_i, B_j) in the m^2 groups and its corresponding
    Figure PCTCN2019088278-appb-100012
    measure the confidence n_ij of the foreground-background pixel pair (F_i, B_j) according to the following formula:
    Figure PCTCN2019088278-appb-100013
    wherein σ takes the value 0.1, and the group of foreground-background pixel pairs corresponding to the highest-confidence MAX(n_ij) is selected as (F_iMAX, B_jMAX);
    a calculation module, configured to: calculate the transparency estimate
    Figure PCTCN2019088278-appb-100014
    of each unknown pixel Z_k according to the following formula:
    Figure PCTCN2019088278-appb-100015
    a determining module, configured to: preliminarily determine the first transparency mask of the first image according to the transparency estimate
    Figure PCTCN2019088278-appb-100016
    of each unknown pixel Z_k;
    a second dividing module, configured to: superimpose grayscale information on the first image to generate a second image, and divide the second image into its set of all foreground pixels, set of all background pixels, and set of all unknown pixels;
    a re-calling module, configured to: for the second image, call the first measurement module, the second measurement module, the calculation module, and the determining module again to determine a first transparency mask of the second image, and use the first transparency mask of the second image as a second transparency mask of the first image;
    a correction module, configured to: correct the first transparency mask of the first image using the second transparency mask of the first image;
    an extraction module, configured to: extract the foreground target in the first image of the video according to the first transparency mask of the first image obtained by the correction module.
  7. The apparatus according to claim 6, further comprising:
    a sequential calling module, configured to: extract each remaining frame from the video, take each in turn as the first image, and call in sequence the first dividing module, the first measurement module, the second measurement module, the calculation module, the determining module, the second dividing module, the re-calling module, the correction module, and the extraction module to extract all foreground targets of the video; or comprising:
    an inheritance calling module, configured to: extract each remaining frame from the video, take each in turn as the first image, and input it to a third dividing module, wherein the third dividing module is configured to divide the set F_c of all foreground pixels, the set B_C of all background pixels, and the set Z_C of all unknown pixels in the first image corresponding to the current frame according to the corrected first transparency mask of the first image of the previous frame; the inheritance calling module then calls in sequence the first measurement module, the second measurement module, the calculation module, the determining module, the second dividing module, the re-calling module, the correction module, and the extraction module to extract all foreground targets of the video, wherein the third dividing module includes:
    a first binary image processing unit, configured to binarize the corrected first transparency mask of the first image of the previous frame with a threshold of 0.5 to obtain a first binary image of the foreground target;
    a second binary image initialization unit, configured to: use the first binary image as the initial value of a second binary image;
    a second binary image processing unit, configured to: perform a morphological erosion operation on the second binary image using a circular structuring element of size 3x3, and update the second binary image with the obtained result;
    a first repeated calling unit, configured to repeatedly call the second binary image processing unit five times;
    a third binary image initialization unit, configured to: use the first binary image as the initial value of a third binary image;
    a third binary image processing unit, configured to: perform a morphological dilation operation on the third binary image using a circular structuring element of size 3x3, and update the third binary image with the obtained result;
    a second repeated calling unit, configured to repeatedly call the third binary image processing unit five times;
    a true/false dividing unit, configured to: take the corresponding pixels that are true in the second binary image last updated by the second binary image processing unit as the set F_c of all foreground pixels, the corresponding pixels that are false in the third binary image last updated by the third binary image processing unit as the set B_C of all background pixels, and the remaining pixels as the set Z_C of all unknown pixels.
  8. The apparatus according to claim 6, wherein the second dividing module further comprises:
    a mean filtering unit, configured to: perform mean filtering on the first image to obtain a third image;
    a second image generation unit, configured to generate the second image from the first image and the third image by the following formula:
    Figure PCTCN2019088278-appb-100017
    wherein IM_2 represents the gray value of the k-th pixel on the second image after superposition, x_r represents a neighborhood pixel of the k-th pixel x_k on the first image, N_k represents the number of pixels in the neighborhood centered on x_k,
    Figure PCTCN2019088278-appb-100018
    represents the pixel value of the k-th pixel on the third image obtained by mean-filtering the first image, and β takes the value 0.5.
  9. The apparatus according to claim 6, wherein the correction module further comprises:
    an edge-finding unit, configured to: find, from the second transparency mask of the first image and the first transparency mask of the first image, the edge of the second transparency mask and the edge of the first transparency mask, respectively;
    a position-determining unit, configured to: obtain the positions of all pixels on the edge of the second transparency mask and the positions of all pixels on the edge of the first transparency mask, determine the region where the two sets of edge-pixel positions coincide, and thereby identify the pixels Z_sp whose positions are the same;
    a first correction unit, configured to: look up, for each pixel Z_sp, its transparency estimate in the first transparency mask of the first image and its transparency estimate in the second transparency mask of the first image, and take the average of the two as the corrected transparency estimate of the pixel Z_sp;
    a second correction unit, configured to: correct the first transparency mask of the first image with the corrected transparency estimates of the pixels Z_sp.
  10. The apparatus according to claim 9, wherein the position-determining unit further comprises:
    a different-position subunit, configured to: based on the determined region where the edge-pixel positions of the second transparency mask coincide with those of the first transparency mask, further identify the pixels Z_dp whose positions differ, including the pixels Z_dp2 located on the edge of the second transparency mask and the pixels Z_dp1 located on the edge of the first transparency mask;
    a closing subunit, configured to: use the different-position pixels Z_dp and the same-position pixels Z_sp to obtain the closed region enclosed between the edge of the second transparency mask and the edge of the first transparency mask, together with the positions of all pixels enclosed by that region;
    a multiple-lookup subunit, configured to:
    (1) look up the transparency estimate, in the first transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp1, look up the transparency value of that corresponding pixel in the second image, and take the average of the two as the corrected transparency estimate of pixel Z_dp1;
    (2) look up the transparency estimate, in the second transparency mask of the first image, of the pixel corresponding to the position of pixel Z_dp2, look up the transparency value of that corresponding pixel in the first image, and take the average of the two as the corrected transparency estimate of pixel Z_dp2;
    a composite correction subunit, configured to: correct the first transparency mask of the first image by combining the corrected transparency estimates of the pixels Z_dp1 with the corrected transparency estimates of the pixels Z_dp2.
PCT/CN2019/088278 2018-09-26 2019-05-24 Video foreground target extraction method and apparatus WO2020062898A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNPCT/CN2018/107514 2018-09-26
CN2018107514 2018-09-26

Publications (1)

Publication Number Publication Date
WO2020062898A1 true WO2020062898A1 (en) 2020-04-02

Family

ID=68140393

Family Applications (5)

Application Number Title Priority Date Filing Date
PCT/CN2019/088278 WO2020062898A1 (en) 2018-09-26 2019-05-24 Video foreground target extraction method and apparatus
PCT/CN2019/088279 WO2020062899A1 (en) 2018-09-26 2019-05-24 Method for obtaining transparency masks by means of foreground and background pixel pairs and grayscale information
PCT/CN2019/101273 WO2020063189A1 (en) 2018-09-26 2019-08-19 Video target trajectory extraction method and device
PCT/CN2019/105028 WO2020063321A1 (en) 2018-09-26 2019-09-10 Video processing method based on semantic analysis and device
PCT/CN2019/106616 WO2020063436A1 (en) 2018-09-26 2019-09-19 Method and apparatus for analysing deep learning (dnn) based classroom learning behaviour

Family Applications After (4)

Application Number Title Priority Date Filing Date
PCT/CN2019/088279 WO2020062899A1 (en) 2018-09-26 2019-05-24 Method for obtaining transparency masks by means of foreground and background pixel pairs and grayscale information
PCT/CN2019/101273 WO2020063189A1 (en) 2018-09-26 2019-08-19 Video target trajectory extraction method and device
PCT/CN2019/105028 WO2020063321A1 (en) 2018-09-26 2019-09-10 Video processing method based on semantic analysis and device
PCT/CN2019/106616 WO2020063436A1 (en) 2018-09-26 2019-09-19 Method and apparatus for analysing deep learning (dnn) based classroom learning behaviour

Country Status (2)

Country Link
CN (5) CN110378867A (en)
WO (5) WO2020062898A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989962B (en) * 2021-02-24 2024-01-05 上海商汤智能科技有限公司 Track generation method, track generation device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101529495A (en) * 2006-09-19 2009-09-09 奥多比公司 Image mask generation
CN101588459A (en) * 2009-06-26 2009-11-25 北京交通大学 A kind of video keying processing method
CN102163216A (en) * 2010-11-24 2011-08-24 广州市动景计算机科技有限公司 Picture display method and device thereof
CN104680482A (en) * 2015-03-09 2015-06-03 华为技术有限公司 Method and device for image processing
WO2017213923A1 (en) * 2016-06-09 2017-12-14 Lytro, Inc. Multi-view scene segmentation and propagation
CN107516319A (en) * 2017-09-05 2017-12-26 中北大学 High-accuracy simple interactive image matting method, storage device and terminal

Family Cites Families (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6361910B1 (en) * 2000-02-03 2002-03-26 Applied Materials, Inc Straight line defect detection
US6870945B2 (en) * 2001-06-04 2005-03-22 University Of Washington Video object tracking by estimating and subtracting background
US7466842B2 (en) * 2005-05-20 2008-12-16 Mitsubishi Electric Research Laboratories, Inc. Modeling low frame rate videos with Bayesian estimation
US8520972B2 (en) * 2008-09-12 2013-08-27 Adobe Systems Incorporated Image decomposition
CN101686338B (en) * 2008-09-26 2013-12-25 索尼株式会社 System and method for partitioning foreground and background in video
CN101621615A (en) * 2009-07-24 2010-01-06 南京邮电大学 Adaptive background modeling and moving target detection method
US8625888B2 (en) * 2010-07-21 2014-01-07 Microsoft Corporation Variable kernel size image matting
US8386964B2 (en) * 2010-07-21 2013-02-26 Microsoft Corporation Interactive image matting
CN102456212A (en) * 2010-10-19 2012-05-16 北大方正集团有限公司 Separation method and system for visible watermarks in digital images
CN102236901B (en) * 2011-06-30 2013-06-05 南京大学 Target tracking method based on graph-theory clustering and color-invariant space
US8744123B2 (en) * 2011-08-29 2014-06-03 International Business Machines Corporation Modeling of temporarily static objects in surveillance video data
US8731315B2 (en) * 2011-09-12 2014-05-20 Canon Kabushiki Kaisha Image compression and decompression for image matting
US9305357B2 (en) * 2011-11-07 2016-04-05 General Electric Company Automatic surveillance video matting using a shape prior
CN102651135B (en) * 2012-04-10 2015-06-17 电子科技大学 Optimized direction sampling-based natural image matting method
US8792718B2 (en) * 2012-06-29 2014-07-29 Adobe Systems Incorporated Temporal matte filter for video matting
CN102999892B (en) * 2012-12-03 2015-08-12 东华大学 Intelligent fusion method for depth images and RGB images based on region masks
CN103366364B (en) * 2013-06-07 2016-06-29 太仓中科信息技术研究院 Image matting method based on color distortion
AU2013206597A1 (en) * 2013-06-28 2015-01-22 Canon Kabushiki Kaisha Depth constrained superpixel-based depth map refinement
WO2015048694A2 (en) * 2013-09-27 2015-04-02 Pelican Imaging Corporation Systems and methods for depth-assisted perspective distortion correction
US20150091891A1 (en) * 2013-09-30 2015-04-02 Dumedia, Inc. System and method for non-holographic teleportation
CN104112144A (en) * 2013-12-17 2014-10-22 深圳市华尊科技有限公司 Person and vehicle identification method and device
WO2015134996A1 (en) * 2014-03-07 2015-09-11 Pelican Imaging Corporation System and methods for depth regularization and semiautomatic interactive matting using rgb-d images
CN104952089B (en) * 2014-03-26 2019-02-15 腾讯科技(深圳)有限公司 Image processing method and system
CN103903230A (en) * 2014-03-28 2014-07-02 哈尔滨工程大学 Sea fog removal and clearing method for video images
CN105590307A (en) * 2014-10-22 2016-05-18 华为技术有限公司 Transparency-based matting method and apparatus
CN104573688B (en) * 2015-01-19 2017-08-25 电子科技大学 Mobile platform tobacco laser code intelligent identification method and device based on deep learning
CN104935832B (en) * 2015-03-31 2019-07-12 浙江工商大学 Video matting method using depth information
CN105100646B (en) * 2015-08-31 2018-09-11 北京奇艺世纪科技有限公司 Method for processing video frequency and device
CN105243670B (en) * 2015-10-23 2018-04-06 北京航空航天大学 Sparse and accurate video foreground object extraction method based on low-rank joint representation
CN105809679B (en) * 2016-03-04 2019-06-18 李云栋 Mountain railway side slope rockfall detection method based on visual analysis
CN106204567B (en) * 2016-07-05 2019-01-29 华南理工大学 Natural background video matting method
CN117864918A (en) * 2016-07-29 2024-04-12 奥的斯电梯公司 Monitoring system for passenger conveyor, passenger conveyor and monitoring method thereof
CN107872644B (en) * 2016-09-23 2020-10-09 亿阳信通股份有限公司 Video monitoring method and device
CN106778810A (en) * 2016-11-23 2017-05-31 北京联合大学 Original image layer fusion method and system based on RGB features and depth features
US10198621B2 (en) * 2016-11-28 2019-02-05 Sony Corporation Image-processing device and method for foreground mask correction for object segmentation
CN106952276A (en) * 2017-03-20 2017-07-14 成都通甲优博科技有限责任公司 Image matting method and device
CN107194867A (en) * 2017-05-14 2017-09-22 北京工业大学 Image matting and compositing method based on CUDA
CN107273905B (en) * 2017-06-14 2020-05-08 电子科技大学 Target active contour tracking method combined with motion information
CN107230182B (en) * 2017-08-03 2021-11-09 腾讯科技(深圳)有限公司 Image processing method and device and storage medium
CN108399361A (en) * 2018-01-23 2018-08-14 南京邮电大学 Pedestrian detection method based on a convolutional neural network (CNN) and semantic segmentation
CN108391118B (en) * 2018-03-21 2020-11-06 惠州学院 Display system for realizing 3D image based on projection mode
CN108320298B (en) * 2018-04-28 2022-01-28 亮风台(北京)信息科技有限公司 Visual target tracking method and equipment

Also Published As

Publication number Publication date
WO2020063321A1 (en) 2020-04-02
CN110378867A (en) 2019-10-25
CN110335288A (en) 2019-10-15
CN110659562A (en) 2020-01-07
CN110363788A (en) 2019-10-22
CN110516534A (en) 2019-11-29
WO2020063189A1 (en) 2020-04-02
WO2020062899A1 (en) 2020-04-02
WO2020063436A1 (en) 2020-04-02

Similar Documents

Publication Publication Date Title
US11610082B2 (en) Method and apparatus for training neural network model used for image processing, and storage medium
JP2022528294A (en) Video background subtraction method using depth
US20200143171A1 (en) Segmenting Objects In Video Sequences
CN109816666B (en) Symmetrical full convolution neural network model construction method, fundus image blood vessel segmentation device, computer equipment and storage medium
CN111383232B (en) Matting method, matting device, terminal equipment and computer readable storage medium
WO2020001222A1 (en) Image processing method, apparatus, computer readable medium, and electronic device
WO2020194254A1 (en) Quality assessment of a video
CN111598777A (en) Sky cloud image processing method, computer device and readable storage medium
JP7328096B2 (en) Image processing device, image processing method, and program
CN110288560A (en) Image blur detection method and device
WO2020062898A1 (en) Video foreground target extraction method and apparatus
CN113935934A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
JP7312026B2 (en) Image processing device, image processing method and program
CN110738625B (en) Image resampling method, device, terminal and computer readable storage medium
Hua et al. Low-light image enhancement based on joint generative adversarial network and image quality assessment
WO2021042641A1 (en) Image segmentation method and apparatus
CN111161299B (en) Image segmentation method, storage medium and electronic device
CN113191376A (en) Image processing method, image processing device, electronic equipment and readable storage medium
CN114764839A (en) Dynamic video generation method and device, readable storage medium and terminal equipment
CN111754411B (en) Image noise reduction method, image noise reduction device and terminal equipment
JP2016162421A (en) Information processor, information processing method and program
JP6468642B2 (en) Information terminal equipment
Borraa An Efficient Shape Adaptive Techniques for the Digital Image Denoising
Shajahan et al. Direction oriented block based inpainting using morphological operations
CN112995634B (en) Image white balance processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19867914

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19867914

Country of ref document: EP

Kind code of ref document: A1