CN112601029B - Video segmentation method, terminal and storage medium with known background prior information - Google Patents


Info

Publication number
CN112601029B
CN112601029B (application CN202011340968.0A)
Authority
CN
China
Prior art keywords
background, current frame, frame, video, entering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011340968.0A
Other languages
Chinese (zh)
Other versions
CN112601029A (en)
Inventor
赵维杰
富宸
徐孝成
王晨宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Weisha Network Technology Co ltd
Original Assignee
Shanghai Weisha Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Weisha Network Technology Co ltd filed Critical Shanghai Weisha Network Technology Co ltd
Priority to CN202011340968.0A priority Critical patent/CN112601029B/en
Publication of CN112601029A publication Critical patent/CN112601029A/en
Application granted granted Critical
Publication of CN112601029B publication Critical patent/CN112601029B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/174 Segmentation; Edge detection involving the use of two or more images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Circuits (AREA)

Abstract

The invention relates to a video segmentation method with known background prior information, which first matches the current frame of a video against the background prior information, predicts the complete background of the current frame, and then segments the target foreground of the current frame. The invention can segment accurately even when the camera moves significantly, ensuring the video segmentation effect.

Description

Video segmentation method, terminal and storage medium with known background prior information
Technical Field
The present invention relates to the field of video processing technologies, and in particular to a video segmentation method, a terminal, and a storage medium with known background prior information.
Background
Existing video foreground/background segmentation typically acquires an image through a camera and then uses manual matting or chroma keying to pull the foreground region out of the image. However, manual matting is complex to operate, making video segmentation inconvenient. And while chroma keying can directly key out foreground regions in an image, this method requires a relatively large solid-color background behind the foreground.
Disclosure of Invention
The invention aims to provide a video segmentation method, a terminal and a storage medium with known background prior information, which are convenient, applicable to video foreground/background segmentation against any background, and able to segment accurately even when the camera moves significantly, thereby ensuring the video segmentation effect.
The technical solution adopted by the invention to solve the above technical problem is as follows: a video segmentation method with known background prior information is provided, in which the current frame of a video is matched against the background prior information, the complete background of the current frame is obtained through prediction, and the target foreground of the current frame is then segmented.
The video segmentation method comprises the following steps:
(1) Setting a background frame and storing it;
(2) Extracting the current frame of the video stream;
(3) Judging whether the current frame matches the background frame; if not, entering step (4), otherwise entering step (5);
(4) Matching the background frame to the background of the current frame through correction;
(5) Segmenting the current frame to obtain the foreground of the current frame.
The background frame in step (1) is a panoramic picture synthesized from a plurality of pictures taken at different angles.
Step (3) is specifically: calculating the similarity between the current frame and the background frame over the region outside the segmentation mask; if the similarity is lower than a threshold, entering step (4), otherwise entering step (5). The similarity can be a picture difference, a structural similarity, a feature-map similarity, and the like.
Step (4) is specifically: using a keypoint matching algorithm to extract and match keypoints in the pre-stored background frame and the current frame respectively, selecting some well-matched keypoints, calculating a transformation matrix, cropping the corresponding background part out of the pre-stored background frame, transforming it with the transformation matrix to the same viewpoint as the current frame, and using it as the new background input of the current frame.
Alternatively, step (4) is: inputting the pre-stored background frame and the current frame into a convolutional neural network whose output is a series of spatial-transformation relation maps, then cropping the corresponding background part out of the pre-stored background frame, transforming it with the spatial-transformation relation maps to the same viewpoint as the current frame, and using it as the new background input of the current frame.
Step (5) is specifically: inputting the pre-stored background frame into an encoding model to obtain a background feature map; inputting the current frame into the encoding model for feature decomposition to obtain a current-frame feature map; fusing the current-frame feature map and the background feature map, performing feature decoding on the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask map to obtain the foreground of the current frame.
The invention also provides a terminal comprising a memory and a processor, the memory storing a video processing program executable on the processor, wherein the video processing program, when executed by the processor, implements the steps of the video segmentation method described above.
The invention further provides a computer-readable storage medium having stored thereon a video processing program which, when executed by a processor, implements the steps of the video segmentation method described above.
Advantageous effects
With the above technical solution, compared with the prior art, the invention has the following advantages and positive effects: the invention checks whether the feature points of the current frame and the background frame match, and when they do not, automatically matches the background frame to the background of the current frame through correction, thereby ensuring accurate segmentation even when the camera moves significantly and guaranteeing the video segmentation effect.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
fig. 2 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Furthermore, it should be understood that various changes or modifications may be made by those skilled in the art after reading the teaching of the present invention, and all such equivalents fall within the scope of the present invention as defined by the appended claims.
The embodiment of the invention relates to a video segmentation method with known background prior information, which matches the current frame of a video against the background prior information, predicts the complete background of the current frame, and then segments the target foreground of the current frame. As shown in fig. 1, the method comprises the following steps: set a background frame and store it; extract the current frame of the video stream; judge whether the current frame matches the background frame, and if not, match the background frame to the background of the current frame through correction; segment the current frame to obtain its foreground; and composite the foreground with a background video.
Fig. 2 is a schematic diagram of the terminal configuration of the hardware operating environment of this embodiment. The terminal may be a device with a video-shooting function, such as a smartphone, a tablet computer, or a PC.
The terminal includes: a processor (e.g., a CPU), a communication bus, a user interface, a network interface, and a memory. The communication bus realizes connection and communication among these components. The user interface may include interfaces for connecting input and output devices. The network interface may include standard wired and wireless interfaces. The memory may be high-speed RAM or non-volatile storage such as disk storage, and may also be a storage device independent of the processor.
The terminal can also comprise a camera, an RF circuit, a sensor, an audio circuit, a WIFI module and the like.
The memory, as a computer-readable storage medium, may contain an operating system, a network communication module, a user interface module, and a video processing program. The operating system manages and controls the terminal's hardware and software resources, and supports the running of the network communication module, the user interface module, the video processing program, and other programs or software; the network communication module manages and controls the network interface; and the user interface module manages and controls the user interface.
In the terminal, the network interface is mainly used for connecting to a server or external device and exchanging data with it, and the user interface is mainly used for connecting the terminal's interface. The terminal calls the video processing program stored in the memory through the processor to implement the following steps:
step 1, setting a background frame and storing the background frame. The method comprises the steps of obtaining a clear background picture shot when a person leaves the background or obtaining a panoramic picture through synthesis of a plurality of pictures at different angles.
Step 2: extract the current frame of a video stream. The video stream is a video stream containing a foreground, and may also be an unordered picture sequence containing a foreground.
Step 3: judge whether the current frame matches the pre-stored background frame; if not, go to step 4, otherwise go to step 5. Specifically: calculate the similarity between the current frame and the pre-stored background frame over the region outside the segmentation mask. If the similarity is lower than a set threshold, the frames do not match, so step 4 is entered for correction; otherwise they match, and step 5 is entered directly for segmentation. In this embodiment, the similarity may be a picture difference, a structural similarity between pictures, or a feature-map similarity between pictures.
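The match test of step 3 can be sketched as the simplest of the listed options, a picture-difference similarity computed only outside the mask. The function name, array conventions, and threshold below are illustrative assumptions, not the patent's implementation (which equally allows structural or feature-map similarity):

```python
import numpy as np

def frames_match(current, background, mask, threshold=0.9):
    """Compare the current frame with the stored background frame,
    ignoring pixels inside the segmentation mask (hypothetical sketch).

    current, background: HxWx3 float arrays with values in [0, 1]
    mask: HxW boolean array, True where the foreground was segmented
    Returns True when 1 - mean absolute difference over the background
    region is at least `threshold`, i.e. the frames are considered matched.
    """
    outside = ~mask                          # compare only background pixels
    diff = np.abs(current - background)[outside]
    similarity = 1.0 - diff.mean()
    return similarity >= threshold
```

A frame that matches triggers segmentation directly (step 5); a mismatch sends the pipeline through the correction of step 4 first.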
Step 4: match the pre-stored background frame to the background of the current frame through correction. Either a keypoint matching algorithm or a convolutional neural network may be used for this, as follows:
With a keypoint matching algorithm: extract and match keypoints in the pre-stored background frame and the current frame respectively, select the well-matched keypoints, calculate a transformation matrix, crop the corresponding background part out of the pre-stored background frame, transform it with the transformation matrix to the same viewpoint as the current frame, and use the cropped background part as the new background input of the current frame.
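The "calculate a transformation matrix" step can be sketched as a least-squares fit over the coordinates of matched keypoints. A production system would more likely estimate a RANSAC homography over ORB or SIFT matches (e.g. with OpenCV), so the affine model and function names here are simplifying assumptions:

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Estimate a 2x3 affine transform mapping src_pts -> dst_pts by least
    squares, standing in for the patent's 'transformation matrix'.
    src_pts, dst_pts: n x 2 arrays of matched keypoint coordinates (n >= 3).
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Solve [x y 1] @ P = [x' y'] for all point pairs in one linear system.
    X = np.hstack([src, np.ones((len(src), 1))])   # n x 3
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)    # 3 x 2
    return P.T                                     # 2 x 3 affine matrix

def apply_affine(A, pts):
    """Map 2-D points through the 2x3 affine matrix A."""
    pts = np.asarray(pts, dtype=float)
    X = np.hstack([pts, np.ones((len(pts), 1))])
    return X @ A.T
```

In the full method the estimated matrix would warp the cropped background region of the stored frame into the current frame's viewpoint rather than individual points.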
With a convolutional neural network: input the pre-stored background frame and the current frame into the network, whose output is a series of spatial-transformation relation maps; then crop the corresponding background part out of the pre-stored background frame, transform it with the spatial-transformation relation maps to the same viewpoint as the current frame, and use the cropped background part as the new background input of the current frame.
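The spatial-transformation relation maps can be read as per-pixel sampling coordinates into the stored background frame. Applying such maps, here with nearest-neighbour sampling and hypothetical names (a trained network would typically pair its maps with bilinear sampling), might look like:

```python
import numpy as np

def remap_nearest(image, map_y, map_x):
    """Resample `image` according to per-pixel coordinate maps (sketch of
    applying the patent's 'spatial transformation relation maps').

    map_y, map_x: HxW float arrays; for each output pixel they give the
    row/column in `image` to sample. Coordinates are rounded to the
    nearest pixel and clipped to the image bounds.
    """
    ys = np.clip(np.rint(map_y).astype(int), 0, image.shape[0] - 1)
    xs = np.clip(np.rint(map_x).astype(int), 0, image.shape[1] - 1)
    return image[ys, xs]
```

An identity map returns the image unchanged; shifting `map_x` by one column samples each output pixel from its right-hand neighbour, i.e. the maps encode the warp directly.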
Step 5: segment the current frame to obtain its foreground. Specifically: input the pre-stored background frame into an encoding model to obtain a background feature map; input the current frame into the encoding model for feature decomposition to obtain a current-frame feature map (whose background part is identical to the background feature map); fuse the two feature maps (that is, match and compare their features across feature spaces of different scales), reconstruct the fused feature map through a decoding model, and output an alpha mask map; then mask the current frame with the alpha mask map to obtain its foreground. To improve the result, post-processing such as edge sharpening may be applied to the segmented foreground.
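Once the alpha mask map is predicted, extracting the foreground and placing it over a new background is standard alpha compositing. The sketch below assumes float images in [0, 1] and omits the encoder/decoder that produces the mask:

```python
import numpy as np

def composite(frame, alpha, new_background):
    """Blend the current frame over a new background using the predicted
    alpha mask map (standard alpha compositing).

    frame, new_background: HxWx3 float arrays in [0, 1]
    alpha: HxW float array in [0, 1], 1 = foreground, 0 = background
    """
    a = alpha[..., None]                 # HxW -> HxWx1 for broadcasting
    return a * frame + (1.0 - a) * new_background
```

With `alpha` equal to 1 everywhere the output is the frame itself; with 0 everywhere it is the new background; fractional values give soft foreground edges.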
In summary, this embodiment matches the background picture to the region the current video frame belongs to and segments the objects that differ from the background, thereby ensuring accurate segmentation even when the camera moves significantly and guaranteeing the video segmentation effect.

Claims (8)

1. A video segmentation method with known background prior information, characterized in that the current frame of a video is matched against the background prior information, the complete background of the current frame is obtained through prediction, and the target foreground of the current frame is then segmented, the method comprising the following steps:
(1) Setting a background frame and storing it;
(2) Extracting the current frame of the video stream;
(3) Judging whether the current frame matches the background frame; if not, entering step (4), otherwise entering step (5);
(4) Matching the background frame to the background of the current frame through correction, specifically: using a keypoint matching algorithm to extract and match keypoints in the pre-stored background frame and the current frame respectively, selecting some well-matched keypoints, calculating a transformation matrix, cropping the corresponding background part out of the pre-stored background frame, transforming it with the transformation matrix to the same viewpoint as the current frame, and using it as the new background input of the current frame;
(5) Segmenting the current frame to obtain the foreground of the current frame, specifically: inputting the pre-stored background frame into an encoding model to obtain a background feature map; inputting the current frame into the encoding model for feature decomposition to obtain a current-frame feature map; fusing the current-frame feature map and the background feature map, performing feature decoding on the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask map to obtain the foreground of the current frame.
2. The video segmentation method according to claim 1, wherein the background frame in step (1) is a panoramic picture synthesized from a plurality of pictures taken at different angles.
3. The video segmentation method according to claim 1, wherein step (3) is specifically: calculating the similarity between the current frame and the background frame over the region outside the segmentation mask; if the similarity is lower than a threshold, entering step (4), otherwise entering step (5).
4. A video segmentation method with known background prior information, characterized in that the current frame of a video is matched against the background prior information, the complete background of the current frame is obtained through prediction, and the target foreground of the current frame is then segmented, the method comprising the following steps:
(1) Setting a background frame and storing it;
(2) Extracting the current frame of a video stream;
(3) Judging whether the current frame matches the background frame; if not, entering step (4), otherwise entering step (5);
(4) Matching the background frame to the background of the current frame through correction, specifically: inputting the pre-stored background frame and the current frame into a convolutional neural network whose output is a series of spatial-transformation relation maps, then cropping the corresponding background part out of the pre-stored background frame, transforming it with the spatial-transformation relation maps to the same viewpoint as the current frame, and using it as the new background input of the current frame;
(5) Segmenting the current frame to obtain the foreground of the current frame, specifically: inputting the pre-stored background frame into an encoding model to obtain a background feature map; inputting the current frame into the encoding model for feature decomposition to obtain a current-frame feature map; fusing the current-frame feature map and the background feature map, performing feature decoding on the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask map to obtain the foreground of the current frame.
5. The video segmentation method according to claim 4, wherein the background frame in step (1) is a panoramic picture synthesized from a plurality of pictures taken at different angles.
6. The video segmentation method according to claim 4, wherein step (3) is specifically: calculating the similarity between the current frame and the background frame over the region outside the segmentation mask; if the similarity is lower than a threshold, entering step (4), otherwise entering step (5).
7. A terminal comprising a memory and a processor, the memory storing a video processing program executable on the processor, wherein the video processing program, when executed by the processor, implements the steps of the video segmentation method according to any one of claims 1 to 6.
8. A computer-readable storage medium having stored thereon a video processing program which, when executed by a processor, implements the steps of the video segmentation method according to any one of claims 1 to 6.
CN202011340968.0A 2020-11-25 2020-11-25 Video segmentation method, terminal and storage medium with known background prior information Active CN112601029B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011340968.0A CN112601029B (en) 2020-11-25 2020-11-25 Video segmentation method, terminal and storage medium with known background prior information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011340968.0A CN112601029B (en) 2020-11-25 2020-11-25 Video segmentation method, terminal and storage medium with known background prior information

Publications (2)

Publication Number Publication Date
CN112601029A CN112601029A (en) 2021-04-02
CN112601029B true CN112601029B (en) 2023-01-03

Family

ID=75183962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011340968.0A Active CN112601029B (en) 2020-11-25 2020-11-25 Video segmentation method, terminal and storage medium with known background prior information

Country Status (1)

Country Link
CN (1) CN112601029B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821399B (en) * 2022-04-07 2024-06-04 厦门大学 Intelligent classroom-oriented blackboard-writing automatic extraction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216888A (en) * 2008-01-14 2008-07-09 浙江大学 A video foreground extracting method under conditions of view angle variety based on fast image registration
CN101676953A (en) * 2008-08-22 2010-03-24 奥多比公司 Automatic video image segmentation
WO2017181892A1 (en) * 2016-04-19 2017-10-26 中兴通讯股份有限公司 Foreground segmentation method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040032906A1 (en) * 2002-08-19 2004-02-19 Lillig Thomas M. Foreground segmentation for digital video
GB0818561D0 (en) * 2008-10-09 2008-11-19 Isis Innovation Visual tracking of objects in images, and segmentation of images
CN104268866B (en) * 2014-09-19 2017-03-01 西安电子科技大学 The video sequence method for registering being combined with background information based on movable information
US20170116741A1 (en) * 2015-10-26 2017-04-27 Futurewei Technologies, Inc. Apparatus and Methods for Video Foreground-Background Segmentation with Multi-View Spatial Temporal Graph Cuts
CN106846336B (en) * 2017-02-06 2022-07-15 腾讯科技(上海)有限公司 Method and device for extracting foreground image and replacing image background
CN111553923B (en) * 2019-04-01 2024-02-23 上海卫莎网络科技有限公司 Image processing method, electronic equipment and computer readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216888A (en) * 2008-01-14 2008-07-09 浙江大学 A video foreground extracting method under conditions of view angle variety based on fast image registration
CN101676953A (en) * 2008-08-22 2010-03-24 奥多比公司 Automatic video image segmentation
WO2017181892A1 (en) * 2016-04-19 2017-10-26 中兴通讯股份有限公司 Foreground segmentation method and device

Also Published As

Publication number Publication date
CN112601029A (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN108921782B (en) Image processing method, device and storage medium
CN106797451B (en) Visual object tracking system with model validation and management
TWI543610B (en) Electronic device and image selection method thereof
WO2022078041A1 (en) Occlusion detection model training method and facial image beautification method
CN108154086B (en) Image extraction method and device and electronic equipment
US10477220B1 (en) Object segmentation in a sequence of color image frames based on adaptive foreground mask upsampling
CN112053417B (en) Image processing method, device and system and computer readable storage medium
CN112288816B (en) Pose optimization method, pose optimization device, storage medium and electronic equipment
CN112381828A (en) Positioning method, device, medium and equipment based on semantic and depth information
CN112270755A (en) Three-dimensional scene construction method and device, storage medium and electronic equipment
WO2022194079A1 (en) Sky region segmentation method and apparatus, computer device, and storage medium
CN112990197A (en) License plate recognition method and device, electronic equipment and storage medium
CN111080665B (en) Image frame recognition method, device, equipment and computer storage medium
CN112601029B (en) Video segmentation method, terminal and storage medium with known background prior information
CN113205011B (en) Image mask determining method and device, storage medium and electronic equipment
CN111079624B (en) Sample information acquisition method and device, electronic equipment and medium
WO2023174063A1 (en) Background replacement method and electronic device
CN116485944A (en) Image processing method and device, computer readable storage medium and electronic equipment
US20230131418A1 (en) Two-dimensional (2d) feature database generation
CN114119405A (en) Image processing method and device, computer readable storage medium and electronic device
CN113613024A (en) Video preprocessing method and device
CN113538462A (en) Image processing method and device, computer readable storage medium and electronic device
CN116228607B (en) Image processing method and electronic device
CN113284077A (en) Image processing method, image processing device, communication equipment and readable storage medium
CN112308809A (en) Image synthesis method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant