CN112601029B - Video segmentation method, terminal and storage medium with known background prior information - Google Patents
- Publication number
- CN112601029B (application CN202011340968.0A)
- Authority
- CN
- China
- Prior art keywords
- background
- current frame
- frame
- video
- entering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/174—Segmentation; Edge detection involving the use of two or more images
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Studio Circuits (AREA)
Abstract
The invention relates to a video segmentation method with known background prior information, which comprises first matching the current frame of a video against the background prior information to predict the complete background of the current frame, and then segmenting the target foreground of the current frame. The invention can segment accurately even when the camera moves substantially, thereby ensuring the video segmentation effect.
Description
Technical Field
The present invention relates to the field of video processing technologies, and in particular to a video segmentation method, a terminal, and a storage medium with known background prior information.
Background
Existing video foreground/background segmentation generally acquires an image through a camera and then extracts the foreground region by manual matting or chroma keying. However, manual matting is complex to operate, making video segmentation inconvenient. Chroma keying can key out the foreground region directly, but it requires a relatively large solid-color background behind the foreground.
Disclosure of Invention
The invention aims to provide a video segmentation method, a terminal and a storage medium with known background prior information, which are convenient, applicable to video foreground/background segmentation against any background, and able to segment accurately even when the camera moves substantially, thereby ensuring the video segmentation effect.
The technical scheme adopted by the invention to solve the technical problem is as follows: a video segmentation method with known background prior information is provided, in which the current frame of a video is matched against the background prior information, the complete background of the current frame is obtained by prediction, and the target foreground of the current frame is then segmented.
The video segmentation method comprises the following steps:
(1) Setting a background frame and storing the background frame;
(2) Extracting a current frame of the video stream;
(3) Judging whether the current frame matches the background frame; if not, entering step (4), otherwise entering step (5);
(4) Matching the background frame to the background of the current frame in a correction mode;
(5) And segmenting the current frame to obtain the foreground of the current frame.
The background frame in the step (1) is a panoramic picture, and the panoramic picture is obtained by synthesizing a plurality of pictures at different angles.
The step (3) is specifically: calculating the similarity between the current frame and the background frame over the region outside the segmentation mask; if the similarity is lower than a threshold, entering step (4), otherwise entering step (5). The similarity may be a picture difference, structural similarity, feature-map similarity, or the like.
The step (4) is specifically: using a key-point matching algorithm to extract and match key points in the pre-stored background frame and the current frame respectively, selecting well-matched key points, computing a transformation matrix, cropping the corresponding background part out of the pre-stored background frame, transforming it with the transformation matrix to the same viewing angle as the current frame, and using the result as the new background input for the current frame.
The step (4) may alternatively be: inputting the pre-stored background frame and the current frame into a convolutional neural network whose output is a series of spatial-transformation relation maps, then cropping the corresponding background part out of the pre-stored background frame, transforming it with the spatial-transformation relation maps to the same viewing angle as the current frame, and using the result as the new background input for the current frame.
The step (5) is specifically: inputting the pre-stored background frame into a coding model to obtain a background feature map; inputting the current frame into the coding model for feature decomposition to obtain a current-frame feature map; fusing the current-frame feature map with the background feature map, decoding the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask map to obtain the foreground of the current frame.
The technical scheme adopted by the invention for solving the technical problems is as follows: there is provided a terminal comprising a memory and a processor, the memory having stored thereon a video processing program executable on the processor, the video processing program when executed by the processor implementing the steps of the video segmentation method described above.
The technical scheme adopted by the invention for solving the technical problems is as follows: there is provided a computer readable storage medium having stored thereon a video processing program which, when executed by a processor, implements the steps of the video segmentation method described above.
Advantageous effects
Owing to the above technical scheme, the invention has the following advantages and positive effects compared with the prior art: the invention checks whether the feature points of the current frame match those of the background frame and, when they do not match, automatically corrects the background frame to match the background of the current frame, so that segmentation remains accurate even when the camera moves substantially, ensuring the video segmentation effect.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
fig. 2 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The embodiment of the invention relates to a video segmentation method with known background prior information, which comprises matching the current frame of a video against the background prior information, predicting the complete background of the current frame, and then segmenting the target foreground of the current frame. As shown in fig. 1, the method comprises the following steps: setting and storing a background frame; extracting the current frame of the video stream; judging whether the current frame matches the background frame and, if not, correcting the background frame to match the background of the current frame; segmenting the current frame to obtain its foreground; and synthesizing the foreground with a background into the output video.
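The per-frame loop described above can be sketched as follows. Here `matcher`, `corrector` and `segmenter` are hypothetical stand-ins for the concrete matching, correction and segmentation algorithms detailed in the steps below; they are not names from the patent:

```python
def segment_video(frames, bg_frame, matcher, corrector, segmenter):
    """Per-frame loop of the method: reuse the stored background when the
    current frame still matches it, otherwise correct the background to the
    current viewpoint first, then segment the foreground."""
    for frame in frames:
        # Step 3: match check; step 4: correction only when needed.
        bg = bg_frame if matcher(frame, bg_frame) else corrector(frame, bg_frame)
        # Step 5: segment the frame against the (possibly corrected) background.
        yield segmenter(frame, bg)
```

The generator form keeps the loop streaming-friendly, matching the "extract a current frame of the video stream" formulation.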
Fig. 2 is a schematic diagram illustrating a terminal configuration of a hardware operating environment according to the present embodiment. The terminal of the embodiment can be a terminal device with a video shooting function, such as a smart phone, a tablet computer and a PC terminal.
The terminal includes: a processor (e.g., a CPU), a communication bus, a user interface, a network interface, and memory. The communication bus realizes connection and communication among these components. The user interface may include interfaces for connecting input and output devices. The network interface may include standard wired and wireless interfaces. The memory may be high-speed RAM or stable storage such as disk storage, and may also be a storage device independent of the processor.
The terminal can also comprise a camera, an RF circuit, a sensor, an audio circuit, a WIFI module and the like.
A memory, which is a kind of computer-readable storage medium, may include therein an operating system, a network communication module, a user interface module, and a video processing program. The operating system is a program for managing and controlling the terminal and software resources, and supports the running of a network communication module, a user interface module, a video processing program and other programs or software; the network communication module is used for managing and controlling the network interface; the user interface module is used for managing and controlling the user interface.
In the terminal, a network interface is mainly used for connecting a server or external equipment and carrying out data communication with the server or external equipment; the user interface is mainly used for connecting a terminal interface; the terminal calls the video processing program stored in the memory through the processor to realize the following steps:
step 1, setting a background frame and storing the background frame. The method comprises the steps of obtaining a clear background picture shot when a person leaves the background or obtaining a panoramic picture through synthesis of a plurality of pictures at different angles.
Step 2, extracting the current frame of a video stream, where the video stream is a video stream containing a foreground, and may also be an unordered picture sequence containing a foreground.
Step 3, judging whether the current frame matches the pre-stored background frame; if not, entering step 4, otherwise entering step 5. Specifically: calculate the similarity between the current frame and the pre-stored background frame over the region outside the segmentation mask; if the similarity is below a set threshold, the two do not match, and step 4 is entered for correction; otherwise they match, and step 5 is entered directly for segmentation. In this embodiment, the similarity may be the difference between the pictures, the structural similarity between the pictures, or the similarity between their feature maps.
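As a minimal sketch of this check, the snippet below scores similarity with a simple inverse mean-squared-error over the region outside the mask. This is only the first of the measures the method allows (picture difference; structural and feature-map similarity would need more machinery), and the 0.9 threshold is an assumed value, not from the patent:

```python
import numpy as np

def background_matches(frame, bg_frame, mask, threshold=0.9):
    """Return True when frame and the stored background agree outside the
    foreground mask (mask is True on foreground pixels).

    Similarity is 1 / (1 + MSE) over the non-masked region -- a simple
    picture-difference measure; the threshold is an assumption.
    """
    outside = ~mask
    diff = (frame[outside].astype(float) - bg_frame[outside].astype(float)) ** 2
    mse = diff.mean() if diff.size else 0.0
    similarity = 1.0 / (1.0 + mse)  # maps MSE in [0, inf) into (0, 1]
    return bool(similarity >= threshold)
```

A frame identical to the background outside the mask scores 1.0 and matches; any sizeable photometric or geometric change drives the score toward 0 and triggers the correction step.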
Step 4, correcting the pre-stored background frame to match the background of the current frame. In this step, either a key-point matching algorithm or a convolutional neural network may be used for the correction, as follows:
when a key point matching algorithm is used, key point extraction and matching are respectively carried out on a pre-stored background frame and a current frame, part of key points which are well matched are selected, a transformation matrix is calculated, a corresponding background part in the pre-stored background frame is cut out, the transformation matrix is used for transforming to a visual angle which is the same as that of the current frame, and the cut background part is used as a new background input of the current frame.
When a convolutional neural network is used, the pre-stored background frame and the current frame are input into the network, whose output is a series of spatial-transformation relation maps; the corresponding background part is then cropped out of the pre-stored background frame and transformed with the spatial-transformation relation maps to the same viewing angle as the current frame, and the cropped background part is used as the new background input for the current frame.
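Applying such a spatial-transformation relation map amounts to resampling the stored background through a per-pixel offset field. The patent does not specify the network architecture, so the offset field is taken as given here, and sampling is nearest-neighbour for brevity:

```python
import numpy as np

def warp_with_offsets(bg, offsets):
    """Resample the stored background through a per-pixel offset field.

    offsets has shape (H, W, 2): output pixel (y, x) is taken from the
    background at (y + dy, x + dx), rounded to the nearest neighbour and
    clamped to the image. A CNN as described would predict this field.
    """
    h, w = bg.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + offsets[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + offsets[..., 1]).astype(int), 0, h - 1)
    return bg[src_y, src_x]
```

A production version would use bilinear sampling (e.g. `grid_sample` in a deep-learning framework) so the warp stays differentiable end to end.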
Step 5, segmenting the current frame to obtain its foreground. Specifically: the pre-stored background frame is input into a coding model to obtain a background feature map; the current frame is input into the same coding model for feature decomposition to obtain a current-frame feature map (the background part of which is identical to the background feature map); the two feature maps are fused (i.e., their features are matched and compared in feature spaces of different scales), the fused feature map is reconstructed through a decoding model, and an alpha mask map is output; the current frame is then segmented with this alpha mask to obtain its foreground. To improve the segmented foreground, its edges may be post-processed, e.g. sharpened.
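Once the decoding model has produced the alpha mask, cutting out the foreground and compositing it over a new background are straightforward array operations. The sketch below takes the predicted matte as an input; the encoder/decoder network itself is not reproduced here:

```python
import numpy as np

def extract_foreground(frame, alpha):
    """Cut the foreground out of the frame with a predicted alpha matte
    (values in [0, 1], as output by the decoding model)."""
    return frame * alpha[..., None]

def composite(frame, alpha, new_bg):
    """Alpha-blend the frame's foreground over a replacement background --
    the final foreground/background synthesis step."""
    a = alpha[..., None]
    return frame * a + new_bg * (1.0 - a)
```

Because the matte is continuous rather than binary, soft edges (hair, motion blur) blend naturally into the new background instead of showing a hard cut-out boundary.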
In summary, this embodiment matches the background picture to the region covered by the current video frame and segments out the objects that differ from the background, so that segmentation remains accurate even when the camera moves substantially, ensuring the video segmentation effect.
Claims (8)
1. A video segmentation method of known background prior information is characterized in that a current frame of a video is matched with the background prior information, a complete background of the current frame is obtained through prediction, and then a target foreground of the current frame is segmented, and the method comprises the following steps:
(1) Setting a background frame and storing the background frame;
(2) Extracting a current frame of the video stream;
(3) Judging whether the current frame is matched with the background frame, if not, entering the step (4), otherwise, entering the step (5);
(4) Matching the background frame to the background of the current frame in a correction mode; the method specifically comprises the following steps: respectively extracting and matching key points of a pre-stored background frame and a current frame by using a key point matching algorithm, selecting some key points with good matching, calculating a transformation matrix, cutting out the corresponding background part in the pre-stored background frame, transforming to the same visual angle as the current frame by using the transformation matrix, and inputting as a new background of the current frame;
(5) Segmenting the current frame to obtain the foreground of the current frame; the method specifically comprises the following steps: inputting a pre-stored background frame into a coding model to obtain a background characteristic diagram; inputting the current frame into the coding model to carry out feature decomposition to obtain a current frame feature map; fusing the current frame feature map and the background feature map, performing feature decoding on the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask image to obtain the foreground of the current frame.
2. The video segmentation method according to claim 1, wherein the background frame in step (1) is a panoramic picture, and the panoramic picture is synthesized from a plurality of pictures at different angles.
3. The video segmentation method according to claim 1, wherein the step (3) is specifically: and (4) calculating the similarity of the areas outside the segmentation mask area of the current frame and the background frame, if the similarity is lower than a threshold value, entering the step (4), and if not, entering the step (5).
4. A video segmentation method of known background prior information is characterized in that a current frame of a video is matched with the background prior information, a complete background of the current frame is obtained through prediction, and then a target foreground of the current frame is segmented, and the method comprises the following steps:
(1) Setting a background frame and storing the background frame;
(2) Extracting a current frame of a video stream;
(3) Judging whether the current frame is matched with the background frame, if not, entering the step (4), otherwise, entering the step (5);
(4) Matching the background frame to the background of the current frame in a correction mode; the method comprises the following specific steps: inputting a pre-stored background frame and a current frame into a convolutional neural network, wherein the output of the convolutional neural network is a series of space transformation relation mapping images, then cutting out a corresponding background part in the pre-stored background frame, transforming the background part into the same visual angle as the current frame by using the space transformation relation mapping images, and inputting the background part as a new background of the current frame;
(5) Segmenting the current frame to obtain the foreground of the current frame; the method specifically comprises the following steps: inputting a pre-stored background frame into a coding model to obtain a background characteristic diagram; inputting the current frame into the coding model to carry out feature decomposition to obtain a current frame feature map; fusing the current frame feature map and the background feature map, performing feature decoding on the fused feature map through a decoding model, and outputting an alpha mask map; and segmenting the current frame based on the alpha mask image to obtain the foreground of the current frame.
5. The video segmentation method according to claim 4, wherein the background frame in step (1) is a panoramic picture, and the panoramic picture is obtained by synthesizing a plurality of pictures at different angles.
6. The video segmentation method according to claim 4, wherein the step (3) is specifically: and (4) calculating the similarity of the areas outside the segmentation mask area of the current frame and the background frame, if the similarity is lower than a threshold value, entering the step (4), and if not, entering the step (5).
7. A terminal comprising a memory and a processor, the memory having stored thereon a video processing program executable on the processor, the video processing program when executed by the processor implementing the steps of the video segmentation method as claimed in any one of claims 1 to 6.
8. A computer-readable storage medium, having stored thereon a video processing program which, when executed by a processor, implements the steps of the video segmentation method as claimed in any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011340968.0A CN112601029B (en) | 2020-11-25 | 2020-11-25 | Video segmentation method, terminal and storage medium with known background prior information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112601029A CN112601029A (en) | 2021-04-02 |
CN112601029B (en) | 2023-01-03
Family
ID=75183962
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011340968.0A Active CN112601029B (en) | 2020-11-25 | 2020-11-25 | Video segmentation method, terminal and storage medium with known background prior information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112601029B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114821399B (en) * | 2022-04-07 | 2024-06-04 | Xiamen University | Intelligent classroom-oriented blackboard-writing automatic extraction method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101216888A (en) * | 2008-01-14 | 2008-07-09 | Zhejiang University | A video foreground extracting method under conditions of view angle variety based on fast image registration |
CN101676953A (en) * | 2008-08-22 | 2010-03-24 | Adobe Inc. | Automatic video image segmentation |
WO2017181892A1 (en) * | 2016-04-19 | 2017-10-26 | ZTE Corporation | Foreground segmentation method and device |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040032906A1 (en) * | 2002-08-19 | 2004-02-19 | Lillig Thomas M. | Foreground segmentation for digital video |
GB0818561D0 (en) * | 2008-10-09 | 2008-11-19 | Isis Innovation | Visual tracking of objects in images, and segmentation of images |
CN104268866B (en) * | 2014-09-19 | 2017-03-01 | Xidian University | The video sequence method for registering being combined with background information based on movable information |
US20170116741A1 (en) * | 2015-10-26 | 2017-04-27 | Futurewei Technologies, Inc. | Apparatus and Methods for Video Foreground-Background Segmentation with Multi-View Spatial Temporal Graph Cuts |
CN106846336B (en) * | 2017-02-06 | 2022-07-15 | Tencent Technology (Shanghai) Co., Ltd. | Method and device for extracting foreground image and replacing image background |
CN111553923B (en) * | 2019-04-01 | 2024-02-23 | Shanghai Weisha Network Technology Co., Ltd. | Image processing method, electronic equipment and computer readable storage medium |
- 2020-11-25: Application CN202011340968.0A filed; granted as CN112601029B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112601029A (en) | 2021-04-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108921782B (en) | Image processing method, device and storage medium | |
CN106797451B (en) | Visual object tracking system with model validation and management | |
TWI543610B (en) | Electronic device and image selection method thereof | |
WO2022078041A1 (en) | Occlusion detection model training method and facial image beautification method | |
CN108154086B (en) | Image extraction method and device and electronic equipment | |
US10477220B1 (en) | Object segmentation in a sequence of color image frames based on adaptive foreground mask upsampling | |
CN112053417B (en) | Image processing method, device and system and computer readable storage medium | |
CN112288816B (en) | Pose optimization method, pose optimization device, storage medium and electronic equipment | |
CN112381828A (en) | Positioning method, device, medium and equipment based on semantic and depth information | |
CN112270755A (en) | Three-dimensional scene construction method and device, storage medium and electronic equipment | |
WO2022194079A1 (en) | Sky region segmentation method and apparatus, computer device, and storage medium | |
CN112990197A (en) | License plate recognition method and device, electronic equipment and storage medium | |
CN111080665B (en) | Image frame recognition method, device, equipment and computer storage medium | |
CN112601029B (en) | Video segmentation method, terminal and storage medium with known background prior information | |
CN113205011B (en) | Image mask determining method and device, storage medium and electronic equipment | |
CN111079624B (en) | Sample information acquisition method and device, electronic equipment and medium | |
WO2023174063A1 (en) | Background replacement method and electronic device | |
CN116485944A (en) | Image processing method and device, computer readable storage medium and electronic equipment | |
US20230131418A1 (en) | Two-dimensional (2d) feature database generation | |
CN114119405A (en) | Image processing method and device, computer readable storage medium and electronic device | |
CN113613024A (en) | Video preprocessing method and device | |
CN113538462A (en) | Image processing method and device, computer readable storage medium and electronic device | |
CN116228607B (en) | Image processing method and electronic device | |
CN113284077A (en) | Image processing method, image processing device, communication equipment and readable storage medium | |
CN112308809A (en) | Image synthesis method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||