CN112995678B - Video motion compensation method and device and computer equipment - Google Patents

Video motion compensation method and device and computer equipment

Info

Publication number
CN112995678B
CN112995678B (application CN202110199719.2A)
Authority
CN
China
Prior art keywords
foreground
compensation
background
video frames
compensation data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110199719.2A
Other languages
Chinese (zh)
Other versions
CN112995678A (en)
Inventor
孙爽 (Sun Shuang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth Display Technology Co ltd
Original Assignee
Nanjing Skyworth Institute Of Information Technology Co ltd
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Skyworth Institute Of Information Technology Co ltd, Shenzhen Skyworth RGB Electronics Co Ltd filed Critical Nanjing Skyworth Institute Of Information Technology Co ltd
Priority to CN202110199719.2A priority Critical patent/CN112995678B/en
Publication of CN112995678A publication Critical patent/CN112995678A/en
Application granted granted Critical
Publication of CN112995678B publication Critical patent/CN112995678B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/68: Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N 23/681: Motion detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a video motion compensation method and device and computer equipment. The video motion compensation method comprises the following steps: first extracting the foreground objects and background areas of two adjacent video frames in a video frame sequence, then obtaining foreground compensation data according to the pixel point information of the foreground objects of every two adjacent video frames, and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background areas of every two adjacent video frames. The foreground compensation data and the background compensation data are fused to obtain a target compensation image, and the target compensation image is inserted between the two adjacent video frames. By dividing a video frame into a foreground object and a background area, performing motion compensation on each separately, and fusing the results into high-quality image motion compensation data, the invention optimizes the data processing of motion compensation, reduces video smear and jitter, and greatly improves the viewing comfort of smart television users.

Description

Video motion compensation method and device and computer equipment
Technical Field
The present invention relates to the field of image processing, and in particular, to a video motion compensation method, apparatus, and computer device.
Background
With the wide popularization of the internet and the development of artificial intelligence technology, intelligent terminal devices such as smart televisions have replaced traditional playback devices as the mainstream of the market. Meanwhile, the rapid popularization of terminal devices such as smart televisions also drives the development of motion estimation and motion compensation (MEMC) technology. MEMC is mainly applied as motion compensation for jittery video, thereby repairing the judder in the video.
For a scene with background jitter and high-frequency moving objects, for example a video of athletes competing on a field shot while the camera is shaking, the conventional MEMC technology cannot effectively overcome the background jitter to obtain high-quality compensation frame data; partial ghosting and poor anti-shake performance occur, which greatly affects the quality of video restoration and the viewing experience of the user.
Therefore, the traditional MEMC technology has the problems of poor anti-jitter effect and low compensation quality.
Disclosure of Invention
In order to solve the above technical problem, the present invention provides a video motion compensation method, apparatus and computer device, and the specific scheme is as follows:
in a first aspect, an embodiment of the present application provides a video motion compensation method, where the method includes:
extracting foreground objects and background areas of two adjacent video frames in a video frame sequence;
obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames, and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background area of every two adjacent video frames;
and fusing the foreground compensation data and the background compensation data to obtain a target compensation image, and inserting the target compensation image between two adjacent video frames.
According to a specific embodiment disclosed in the application, the step of extracting the foreground object of the video frame comprises the following steps:
and extracting the foreground target from the video frame through a pre-trained foreground extraction model, wherein the foreground extraction model is a multi-scale full-convolution neural network model.
According to a specific embodiment disclosed in the present application, the step of obtaining foreground compensation data according to the pixel point information of the foreground object of every two adjacent video frames, and obtaining background compensation data according to the pixel point coordinate transformation parameter of the background area of every two adjacent video frames includes:
acquiring pixel point information and centroid coordinates of the foreground target, and compensating the foreground target by using the pixel point information and the centroid coordinates to obtain foreground compensation data;
and solving transformation parameters for affine transformation of the background area by using the characteristic point distance of the background area in every two adjacent video frames, and compensating the background area based on the transformation parameters to obtain background compensation data.
According to a specific embodiment disclosed in the present application, the pixel point information includes coordinates and pixel values of each pixel point, and the step of obtaining the pixel point information and the centroid coordinates of the foreground object includes:
acquiring coordinates and pixel values of each pixel point of the foreground target;
calculating to obtain a centroid coordinate of the foreground target according to the coordinate and the pixel value of each pixel point; wherein the formula for calculating the centroid coordinate is:
$$X = \frac{\sum_{i} x_i\, p_i}{\sum_{i} p_i} \qquad Y = \frac{\sum_{i} y_i\, p_i}{\sum_{i} p_i}$$
wherein X is the centroid coordinate in the x-axis direction, x_i is the x-direction coordinate of the i-th pixel point of the foreground object, p_i is the pixel value of the i-th pixel point, Y is the centroid coordinate in the y-axis direction, and y_i is the y-direction coordinate of the i-th pixel point.
According to a specific embodiment disclosed in the present application, the step of compensating the foreground object by using the pixel point information and the centroid coordinate to obtain foreground compensation data includes:
performing Euclidean transformation based on coordinates and centroid coordinates of all edge pixel points corresponding to foreground objects in the two adjacent video frames to obtain a first Euclidean distance value;
storing each first Euclidean distance value to a corresponding coordinate array, and calculating according to all the coordinate arrays to obtain an original motion track;
smoothing the original motion trail by adopting a filter to obtain a smooth motion trail model;
and inputting the pixel point information corresponding to the foreground target into the smooth motion track model to obtain the foreground compensation data of two adjacent video frames.
According to a specific embodiment disclosed in the present application, the step of solving the transformation parameters of the affine transformation performed on the background area by using the feature point distance of the background area in each two adjacent video frames includes:
selecting N non-overlapping rectangular areas from the background area as matching areas, wherein N is a positive integer;
detecting all feature points in each matching area;
matching the feature points of the two adjacent video frames to obtain a corresponding associated feature point combination between the two adjacent video frames, wherein the associated feature point combination comprises two feature points with the same feature value between the two adjacent video frames;
selecting an optimal feature point combination from all the associated feature point combinations according to a second Euclidean distance value and a Hamming distance value corresponding to each associated feature point combination;
and solving the transformation parameters for affine transformation of the optimal characteristic point combination between two adjacent video frames.
According to a specific embodiment disclosed in the present application, the step of selecting an optimal feature point combination from all the associated feature point combinations according to the second euclidean distance value and the hamming distance value corresponding to each associated feature point combination includes:
calculating a second Euclidean distance value and a Hamming distance value between two feature points in each associated feature point combination;
determining a minimum Euclidean distance value R1 from all the second Euclidean distance values, and determining a minimum Hamming distance value R2 from all of said Hamming distance values;
performing cluster analysis on the associated feature point combinations whose second Euclidean distance value is not more than 2R1 and whose Hamming distance value is not more than 2R2, so as to obtain the optimal feature point combination.
According to a specific embodiment disclosed in the present application, the step of compensating the background region based on the transformation parameter to obtain background compensation data includes:
constructing a background region compensation model according to the transformation parameters, wherein the transformation parameters comprise at least one of translation amount, scaling amount, turnover amount, rotation amount and shearing amount;
and inputting the pixel point information contained in the background area into the background area compensation model to obtain the background compensation data.
According to a specific embodiment disclosed in the present application, the step of fusing the foreground compensation data and the background compensation data to obtain a target compensation image includes:
fusing the foreground compensation data and the background compensation data in different scales by using an image pyramid model to obtain compensation images in different scales;
and selecting a target compensation image from the compensation images with different scales according to a selection instruction triggered by a user, wherein the target compensation image is any one of the compensation images with all different scales.
In a second aspect, an embodiment of the present application provides an apparatus for video motion compensation, where the apparatus includes:
the extraction module is used for extracting foreground objects and background areas of two adjacent video frames in the video frame sequence;
the compensation module is used for obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background area of every two adjacent video frames;
and the fusion module is used for fusing the foreground compensation data and the background compensation data to obtain a target compensation image and inserting the target compensation image between two adjacent video frames.
In a third aspect, the present application provides a computer device, which includes a processor and a memory, where the memory stores a computer program, and the computer program implements the method of any one of the embodiments of the first aspect when executed on the processor.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program that, when executed on a processor, implements the method of any one of the embodiments of the first aspect.
Compared with the prior art, the method has the following beneficial effects:
the invention provides a video motion compensation method and device and computer equipment. The video motion compensation method comprises the following steps: the method comprises the steps of firstly extracting foreground targets and background areas of two adjacent video frames in a video frame sequence, then obtaining foreground compensation data according to pixel point information of the foreground targets of every two adjacent video frames, and obtaining the background compensation data according to pixel point coordinate transformation parameters of the background areas of every two adjacent video frames. And fusing the foreground compensation data and the background compensation data to obtain a target compensation image, and inserting the target compensation image between two adjacent video frames. According to the invention, the video frame is divided into the foreground target and the background area, and the foreground target and the background area are respectively subjected to motion compensation to obtain high-quality image motion compensation data, so that the data processing process of motion compensation is optimized, the conditions of video smear and jitter are reduced, and the use comfort of smart television users is greatly improved.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings required to be used in the embodiments will be briefly described below, and it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope of the present invention. Like components are numbered similarly in the various figures.
Fig. 1 is a flowchart illustrating a video motion compensation method according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of an image pyramid model according to an embodiment of the present disclosure;
fig. 3 is a block diagram of a video motion compensation apparatus according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Hereinafter, the terms "including", "having", and their derivatives, which may be used in various embodiments of the present invention, are only intended to indicate specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be construed as excluding the existence of, or the possibility of adding, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.
Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the present invention belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments of the present invention.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, a schematic flow chart of a video motion compensation method provided in an embodiment of the present application is shown in fig. 1, where the method mainly includes:
step S101, extracting foreground objects and background areas of two adjacent video frames in a video frame sequence.
In the film era, movies were shot and shown at about 24 frames per second. If a television or other terminal device still displayed only 24 pictures per second, the picture would appear to flicker. Therefore, the display frequency of televisions has been raised to 50 Hz/60 Hz, and for a movie shot at 24 frames per second, frames are interpolated so that the picture appears smoother when displayed on terminal equipment with a higher refresh rate.
Motion compensation is a picture quality technique commonly used in LCD televisions today: a motion-compensated frame is inserted between two adjacent image frames, so that intermittent high-speed motion appears continuous and smooth.
The video motion compensation method extracts the foreground objects and background areas of two adjacent video frames, estimates the motion trajectories of the foreground object and the background area separately, and intelligently inserts frames into the picture, so that moving images appear smooth and the picture is clearer. It should be noted that the scheme provided by this embodiment may interpolate a compensation frame between every two adjacent video frames in the video, or only between some adjacent video frames, especially adjacent frames related to a motion scene.
The step of extracting a foreground object of a video frame comprises:
and extracting the foreground target from the video frame through a pre-trained foreground extraction model, wherein the foreground extraction model is a multi-scale full-convolution neural network model.
In a specific implementation, after the foreground object of a video frame is extracted, the remaining frame data may be used as the background area without additional identification; alternatively, the background area may be extracted by another background extraction model, which is not limited herein.
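As an illustration of this step, the following Python sketch is given under stated assumptions: the patent's multi-scale full-convolution foreground extraction model is not public, so a generic pretrained segmentation network from torchvision stands in for it, and the helper name extract_foreground_mask is made up for illustration. The remaining pixels are then treated as the background area.

```python
import cv2
import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Stand-in for the patent's pre-trained multi-scale full-convolution model:
# any pretrained segmentation network yields a usable foreground mask.
model = deeplabv3_resnet50(weights="DEFAULT").eval()

def extract_foreground_mask(frame_bgr: np.ndarray) -> np.ndarray:
    """Return a binary mask: 1 = foreground object, 0 = background area."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    chw = np.ascontiguousarray(((rgb - mean) / std).transpose(2, 0, 1))
    with torch.no_grad():
        logits = model(torch.from_numpy(chw).float()[None])["out"][0]  # (classes, H, W)
    labels = logits.argmax(0).numpy()
    return (labels != 0).astype(np.uint8)  # every non-background class counts as foreground
```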
Step S102, obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames, and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background area of every two adjacent video frames.
For videos with background jitter and high-frequency moving objects, the conventional motion compensation technology cannot effectively overcome the background jitter to obtain high-quality compensation frame data, which greatly affects the quality of video restoration. For example, in a video of athletes competing on a field shot with a shaking camera, traditional global motion compensation parameters cannot effectively overcome the jitter to obtain high-quality compensation data, which degrades the video quality and greatly affects the user's viewing experience.
The foreground compensation and the background compensation are to extract foreground objects and background areas of two adjacent video frames respectively, and to obtain the motion tracks of the foreground objects and the background areas of the two adjacent video frames through pixel point information in the video frames. And then calculating according to the motion trail to obtain pixel point information corresponding to the foreground target and the background area which need to be inserted between the adjacent video frames, namely respectively carrying out motion compensation of different degrees, thereby obtaining high-quality image motion compensation data.
In specific implementation, the step of obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames and obtaining background compensation data according to the pixel point coordinate transformation parameter of the background area of every two adjacent video frames includes:
acquiring pixel point information and a centroid coordinate of the foreground target, and compensating the foreground target by using the pixel point information and the centroid coordinate to obtain foreground compensation data;
and solving transformation parameters for affine transformation of the background area by using the characteristic point distance of the background area in every two adjacent video frames, and compensating the background area based on the transformation parameters to obtain background compensation data.
It should be noted that, in the scheme provided in this embodiment, there is no sequential limitation on the actions of obtaining the foreground compensation data and the background compensation data, and the actions may be processed simultaneously or sequentially.
This embodiment mainly further defines the process of acquiring the foreground compensation data and the background compensation data. For the foreground compensation data, a motion trajectory equation is constructed from the edge pixel point information and the centroid coordinates of the foreground object, and the compensation data of the foreground object are generated by prediction through this motion trajectory equation. For the background compensation data, the motion vector of the background area is further calculated through the affine change between the background areas of adjacent frames, so as to perform motion compensation on the background area of the current two video frames.
The transformation parameters refer to the parameters for performing affine transformation on the background areas of every two adjacent video frames. Affine transformation is a linear transformation from two-dimensional coordinates to two-dimensional coordinates that preserves the "straightness" and "parallelism" of a two-dimensional figure. Straightness means that straight lines remain straight and arcs remain arcs after the transformation; parallelism means that the relative positional relationship between elements of the figure is preserved, i.e., parallel lines remain parallel and the intersection angle of intersecting straight lines is unchanged. The affine transformation is the geometric transformation model that best fits the change between every two adjacent video frames.
In specific implementation, the pixel information includes coordinates and pixel values of each pixel, and the step of obtaining the pixel information and the centroid coordinates of the foreground object includes:
acquiring coordinates and pixel values of each pixel point of the foreground target;
calculating to obtain a centroid coordinate of the foreground target according to the coordinate and the pixel value of each pixel point; wherein the formula for calculating the centroid coordinate is:
$$X = \frac{\sum_{i} x_i\, p_i}{\sum_{i} p_i} \qquad Y = \frac{\sum_{i} y_i\, p_i}{\sum_{i} p_i}$$
wherein X is the centroid coordinate in the x-axis direction, x_i is the x-direction coordinate of the i-th pixel point of the foreground object, p_i is the pixel value of the i-th pixel point, Y is the centroid coordinate in the y-axis direction, and y_i is the y-direction coordinate of the i-th pixel point.
In particular, the centroid of the image is also referred to as the center of gravity of the image. The traditional concept of the centroid can be extended to images: the pixel value at each point can be understood as the mass at that point, except that the image is 2-dimensional, so the solution is to find the centroid independently in the X direction and the Y direction. That is, for the centroid in the X direction, the pixel sums of the image on the left and right sides of the centroid are equal, and for the centroid in the Y direction, the pixel sums of the image above and below the centroid are equal.
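The centroid formulas above translate directly into NumPy. This is a minimal sketch assuming the foreground is given as a binary mask over a grayscale frame; the helper name image_centroid is illustrative.

```python
import numpy as np

def image_centroid(gray: np.ndarray, mask: np.ndarray) -> tuple[float, float]:
    """Centroid (X, Y) of the foreground pixels selected by a binary mask."""
    ys, xs = np.nonzero(mask)                # coordinates of foreground pixels
    p = gray[ys, xs].astype(np.float64)      # pixel values act as the "mass"
    total = p.sum()
    if total == 0:                           # degenerate all-zero foreground
        return float(xs.mean()), float(ys.mean())
    return float((xs * p).sum() / total), float((ys * p).sum() / total)
```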
In specific implementation, the step of compensating the foreground target by using the pixel point information and the centroid coordinate to obtain foreground compensation data includes:
performing Euclidean transformation based on coordinates and centroid coordinates of all edge pixel points corresponding to foreground objects in the two adjacent video frames to obtain a first Euclidean distance value;
storing each first Euclidean distance value to a corresponding coordinate array, and calculating according to all the coordinate arrays to obtain an original motion track;
smoothing the original motion trail by adopting a filter to obtain a smooth motion trail model;
and inputting the pixel point information corresponding to the foreground target into the smooth motion track model to obtain the foreground compensation data of two adjacent video frames.
Specifically, the Euclidean distance value refers to the real distance between two points in an n-dimensional space, or the natural length of a vector, i.e., the distance of a point from the origin. In two and three dimensions, the Euclidean distance is the actual distance between the two points. Generalized to an n-dimensional space, the formula is:
$$d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
wherein d is the Euclidean distance value, and x_i and y_i are the coordinates of the two points in the i-th dimension.
In a specific implementation, Euclidean transformation is performed based on the coordinates and centroid coordinates of all edge pixel points corresponding to the foreground objects in two adjacent video frames. After the first Euclidean distance values are obtained, each first Euclidean distance value is decomposed into a horizontal coordinate value, a vertical coordinate value and an angle value, which are stored in the corresponding coordinate arrays. The array data are then differentiated and accumulated to obtain the original motion trajectory curve, and the curve is smoothed with a filter to remove abnormal segments, yielding a smooth motion trajectory model. Filtering in this way effectively suppresses and prevents interference.
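A minimal sketch of this trajectory-smoothing idea is given below. The decomposition into per-frame horizontal, vertical and angle increments follows the description above, while the moving-average filter, its window size, and the function name smooth_trajectory are assumptions standing in for whatever filter an implementation actually uses.

```python
import numpy as np

def smooth_trajectory(dx: np.ndarray, dy: np.ndarray, dtheta: np.ndarray,
                      window: int = 15) -> np.ndarray:
    """Accumulate per-frame (dx, dy, dtheta) into a trajectory and smooth it.

    `window` must be odd; the output has shape (n_frames, 3).
    """
    raw = np.cumsum(np.stack([dx, dy, dtheta], axis=1), axis=0).astype(np.float64)
    kernel = np.ones(window) / window        # moving-average filter
    pad = window // 2
    smoothed = np.empty_like(raw)
    for c in range(3):                       # filter each component separately
        padded = np.pad(raw[:, c], (pad, pad), mode="edge")
        smoothed[:, c] = np.convolve(padded, kernel, mode="valid")
    return smoothed
```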
The step of solving the transformation parameters for performing affine transformation on the background area by using the feature point distance of the background area in every two adjacent video frames comprises the following steps:
selecting N non-overlapping rectangular regions in the background region as matching regions, wherein N is a positive integer;
detecting all feature points in each matching area;
matching the feature points of the two adjacent video frames to obtain a corresponding associated feature point combination between the two adjacent video frames, wherein the associated feature point combination comprises two feature points with the same feature value between the two adjacent video frames;
selecting an optimal feature point combination from all the associated feature point combinations according to a second Euclidean distance value and a Hamming distance value corresponding to each associated feature point combination;
and solving the transformation parameters for affine transformation of the optimal feature point combination between two adjacent video frames.
In a specific implementation, Harris corner detection and SURF keypoint detection can be adopted for each matching region to find all the feature points in each matching region.
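The sketch below illustrates per-region feature detection. It is not the patent's exact pipeline: SURF requires the opencv-contrib build, so ORB (whose binary descriptors also support the Hamming comparison described next) is substituted as a stand-in, and the region list, feature count, and helper name are assumptions.

```python
import cv2
import numpy as np

def detect_region_features(gray: np.ndarray,
                           regions: list[tuple[int, int, int, int]]):
    """Detect keypoints/descriptors inside N non-overlapping (x, y, w, h) regions."""
    orb = cv2.ORB_create(nfeatures=500)
    keypoints, descriptors = [], []
    for (x, y, w, h) in regions:
        mask = np.zeros(gray.shape, dtype=np.uint8)
        mask[y:y + h, x:x + w] = 255         # restrict detection to this matching region
        kps, des = orb.detectAndCompute(gray, mask)
        if des is not None:
            keypoints.extend(kps)
            descriptors.append(des)
    return keypoints, (np.vstack(descriptors) if descriptors else None)
```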
The Hamming distance is the number of positions at which the corresponding characters of two equal-length strings differ; d(x, y) can be used to denote the Hamming distance between strings x and y. Viewed another way, the Hamming distance measures the minimum number of single-character substitutions required to change string x into string y. In other words, the Hamming distance value is the number of characters that need to be replaced to convert one string into the other. For example, the Hamming distance between 1011101 and 1001001 is 2, the Hamming distance between 2143896 and 2233796 is 3, and the Hamming distance between "toned" and "roses" is 3.
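For concreteness, the worked examples above can be checked with a tiny helper (illustrative only):

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))

assert hamming_distance("1011101", "1001001") == 2
assert hamming_distance("2143896", "2233796") == 3
assert hamming_distance("toned", "roses") == 3
```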
The step of selecting an optimal feature point combination from all the associated feature point combinations according to the second euclidean distance value and the hamming distance value corresponding to each associated feature point combination includes:
calculating a second Euclidean distance value and a Hamming distance value between two feature points in each associated feature point combination;
determining a minimum Euclidean distance value R1 from all the second Euclidean distance values, and determining a minimum Hamming distance value R2 from all of the Hamming distance values;
performing cluster analysis on the associated feature point combinations whose second Euclidean distance value is not more than 2R1 and whose Hamming distance value is not more than 2R2, so as to obtain the optimal feature point combination.
In a specific implementation, for the associated feature point combinations between two adjacent video frames, the corresponding minimum distance values are selected as the thresholds. When the second Euclidean distance and the Hamming distance of an associated feature point combination are both greater than twice the corresponding preset threshold, that combination is deleted; otherwise it is retained. In addition, the factor of two in the screening condition is only a preferred value; the threshold multiple may be set reasonably according to the practical application. Cluster analysis is then performed on the associated feature point combinations whose second Euclidean distance value is not more than 2R1 and whose Hamming distance value is not more than 2R2 to obtain the optimal feature point combination, and the transformation parameters for affine transformation of the optimal feature point combination between the two adjacent video frames are solved.
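A sketch of this screening rule is shown below, assuming the pairwise distances have already been collected into two parallel NumPy arrays; the data layout and the function name filter_matches are assumptions.

```python
import numpy as np

def filter_matches(euclidean: np.ndarray, hamming: np.ndarray) -> np.ndarray:
    """Return the indices of matches surviving the 2*R1 / 2*R2 screening."""
    r1 = euclidean.min()                     # minimum second Euclidean distance
    r2 = hamming.min()                       # minimum Hamming distance
    keep = (euclidean <= 2 * r1) & (hamming <= 2 * r2)
    return np.nonzero(keep)[0]
```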
The step of compensating the background region based on the transformation parameters to obtain background compensation data includes:
constructing a background region compensation model according to the transformation parameters, wherein the transformation parameters comprise at least one of translation amount, scaling amount, turnover amount, rotation amount and shearing amount;
and inputting the pixel point information contained in the background area into the background area compensation model to obtain the background compensation data.
Specifically, the whole background region compensation model is constructed based on the transformation parameters of affine transformation carried out by the optimal feature point combination, so that the calculated amount can be greatly reduced, and the optimal transformation effect can be achieved. The transformation parameters for affine transformation of the optimal feature point combination can be represented by the following transformation matrix:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & t_x \\ a_3 & a_4 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
wherein (t_x, t_y) represents the translation amount, the parameters a_1, a_2, a_3 and a_4 together encode the scaling, flipping, rotation and shearing amounts, x and y are the abscissa and ordinate of a feature point before the affine transformation, and x' and y' are its abscissa and ordinate after the affine transformation.
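The following sketch shows one way to solve such affine parameters from matched point pairs and apply them to the background area with OpenCV. cv2.estimateAffine2D returns the 2x3 matrix [[a1, a2, t_x], [a3, a4, t_y]] corresponding to the transformation above; warping the previous background halfway toward the next frame to synthesize the compensation data, and the linearized "halfway" matrix, are assumptions here, not steps the patent specifies.

```python
import cv2
import numpy as np

def compensate_background(prev_bg: np.ndarray,
                          pts_prev: np.ndarray,
                          pts_next: np.ndarray) -> np.ndarray:
    """Warp the previous background halfway toward the next frame.

    pts_prev / pts_next: matched feature points, float32 arrays of shape (N, 2).
    """
    m, _inliers = cv2.estimateAffine2D(pts_prev, pts_next, method=cv2.RANSAC)
    half = m.copy()
    half[:, :2] = (np.eye(2) + m[:, :2]) / 2.0   # halfway linear part (approximation)
    half[:, 2] = m[:, 2] / 2.0                   # halfway translation
    h, w = prev_bg.shape[:2]
    return cv2.warpAffine(prev_bg, half, (w, h))
```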
And step S103, fusing the foreground compensation data and the background compensation data to obtain a target compensation image, and inserting the target compensation image between two adjacent video frames.
In the conventional global motion compensation technology, the whole video frame is compensated uniformly, so local regions of the compensated frame may remain blurred. In step S103, by distinguishing the foreground object from the background area of the video frame and performing motion compensation on them to different degrees, two sets of high-quality compensation data can be obtained, video motion compensation and image restoration can be completed effectively, the processing of the motion compensation data is optimized, the occurrence of video smear and jitter is reduced, and the viewing comfort of smart television users is greatly improved.
The step of fusing the foreground compensation data and the background compensation data to obtain a target compensation image comprises:
fusing the foreground compensation data and the background compensation data in different scales by using an image pyramid model to obtain compensation images in different scales;
and selecting a target compensation image from the compensation images with different scales according to a selection instruction triggered by a user, wherein the target compensation image is any one of the compensation images with all different scales.
Referring to fig. 2, fig. 2 is a schematic diagram of the structure of an image pyramid according to an embodiment of the present application. An image pyramid is a method for interpreting the structure of an image at multiple resolutions: multi-scale pixel sampling of an original image generates N images with different resolutions. The image with the highest resolution is placed at the bottom, and successively smaller images are stacked above it in a pyramid shape, until the top of the pyramid contains only one pixel. In Fig. 2, the decrease in image resolution is represented by Level 0 → Level 1 → Level 2 → Level 3 → Level 4.
The image pyramid model is used to fuse the foreground compensation data and the background compensation data at different resolutions to obtain compensation images at different scales; the compensation image whose fusion effect is most consistent with the visual effect of the original video is then selected, so that more realistic frame data are synthesized and inserted into the video sequence.
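A sketch of such a multi-scale fusion is given below, using a Laplacian pyramid blend of the foreground and background compensation images under a binary foreground mask. The number of levels, the mask-based blending, the 3-channel image assumption, and the function name pyramid_fuse are illustrative choices, not details given by the patent.

```python
import cv2
import numpy as np

def pyramid_fuse(fg: np.ndarray, bg: np.ndarray, mask: np.ndarray,
                 levels: int = 4) -> np.ndarray:
    """Multi-scale blend of 3-channel fg/bg compensation images under a (H, W) mask."""
    gm = [mask.astype(np.float32)]
    gf, gb = [fg.astype(np.float32)], [bg.astype(np.float32)]
    for _ in range(levels):                  # build Gaussian pyramids
        gm.append(cv2.pyrDown(gm[-1]))
        gf.append(cv2.pyrDown(gf[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
    # Laplacian pyramids for fg/bg; the mask keeps its Gaussian pyramid
    lap_f = [gf[i] - cv2.pyrUp(gf[i + 1], dstsize=gf[i].shape[1::-1]) for i in range(levels)] + [gf[-1]]
    lap_b = [gb[i] - cv2.pyrUp(gb[i + 1], dstsize=gb[i].shape[1::-1]) for i in range(levels)] + [gb[-1]]
    fused = [m[..., None] * f + (1 - m[..., None]) * b for m, f, b in zip(gm, lap_f, lap_b)]
    out = fused[-1]
    for level in range(levels - 1, -1, -1):  # collapse the blended pyramid
        out = cv2.pyrUp(out, dstsize=fused[level].shape[1::-1]) + fused[level]
    return np.clip(out, 0, 255).astype(np.uint8)
```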
According to the video motion compensation method provided by the invention, high-quality compensation frame data can be obtained by distinguishing the foreground and the background of the video and performing motion compensation in different degrees by adopting different methods, the video motion compensation and image restoration can be efficiently completed, the processing process of motion compensation data is optimized, and the situations of video smear and jitter are reduced.
In correspondence with the above method embodiment, referring to fig. 3, the present invention further provides a video motion compensation apparatus 300, where the video motion compensation apparatus 300 includes:
an extracting module 301, configured to extract a foreground object and a background area of two adjacent video frames in a video frame sequence;
the compensation module 302 is configured to obtain foreground compensation data according to pixel point information of the foreground object of each two adjacent video frames, and obtain background compensation data according to pixel point coordinate transformation parameters of the background area of each two adjacent video frames;
and a fusion module 303, configured to fuse the foreground compensation data and the background compensation data to obtain a target compensation image, and insert the target compensation image between two adjacent video frames.
Furthermore, a computer device is provided, the computer device comprising a processor and a memory, the memory storing a computer program, the computer program when executed on the processor implementing the above video motion compensation method.
Furthermore, a computer-readable storage medium is provided, in which a computer program is stored which, when executed on a processor, implements the above-described video motion compensation method.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module or unit in each embodiment of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the present invention or a part of the technical solution that contributes to the prior art in essence can be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a smart phone, a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention.

Claims (11)

1. A method for video motion compensation, the method comprising:
extracting foreground objects and background areas of two adjacent video frames in a video frame sequence;
obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames, and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background area of every two adjacent video frames;
fusing the foreground compensation data and the background compensation data to obtain a target compensation image, and inserting the target compensation image between two adjacent video frames;
the step of obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames and obtaining background compensation data according to the pixel point coordinate transformation parameter of the background area of every two adjacent video frames comprises the following steps:
acquiring pixel point information and a centroid coordinate of the foreground target, and compensating the foreground target by using the pixel point information and the centroid coordinate to obtain foreground compensation data;
and solving transformation parameters for affine transformation of the background area by using the characteristic point distance of the background area in every two adjacent video frames, and compensating the background area based on the transformation parameters to obtain background compensation data.
2. The method according to claim 1, wherein the step of extracting foreground objects of the video frame comprises:
and extracting the foreground target from the video frame through a pre-trained foreground extraction model, wherein the foreground extraction model is a multi-scale full-convolution neural network model.
3. The method according to claim 1, wherein the pixel information includes coordinates and pixel values of each pixel, and the step of obtaining the pixel information and centroid coordinates of the foreground object includes:
obtaining coordinates and pixel values of each pixel point of the foreground target;
calculating to obtain a centroid coordinate of the foreground target according to the coordinate and the pixel value of each pixel point; wherein the formula for calculating the coordinates of the center of mass is:
$$X = \frac{\sum_{i} x_i\, p_i}{\sum_{i} p_i} \qquad Y = \frac{\sum_{i} y_i\, p_i}{\sum_{i} p_i}$$
wherein X is the centroid coordinate in the x-axis direction, x_i is the x-direction coordinate of the i-th pixel point of the foreground object, p_i is the pixel value of the i-th pixel point, Y is the centroid coordinate in the y-axis direction, and y_i is the y-direction coordinate of the i-th pixel point.
4. The method of claim 1, wherein the step of compensating the foreground object by using the pixel point information and the centroid coordinates to obtain foreground compensation data comprises:
performing Euclidean transformation on the basis of coordinates and centroid coordinates of all edge pixel points corresponding to foreground objects in the two adjacent video frames to obtain a first Euclidean distance value;
storing each first Euclidean distance value to a corresponding coordinate array, and calculating according to all the coordinate arrays to obtain an original motion track;
smoothing the original motion trail by adopting a filter to obtain a smooth motion trail model;
and inputting the pixel point information corresponding to the foreground target into the smooth motion track model to obtain the foreground compensation data of two adjacent video frames.
5. The method according to claim 1, wherein the step of solving the transformation parameters for affine transformation of the background area by using the feature point distance of the background area in each two adjacent video frames comprises:
selecting N non-overlapping rectangular regions in the background region as matching regions, wherein N is a positive integer;
detecting all feature points in each matching region;
matching the feature points of the two adjacent video frames to obtain a corresponding associated feature point combination between the two adjacent video frames, wherein the associated feature point combination comprises two feature points with the same feature value between the two adjacent video frames;
selecting an optimal feature point combination from all the associated feature point combinations according to a second Euclidean distance value and a Hamming distance value corresponding to each associated feature point combination;
and solving the transformation parameters for affine transformation of the optimal characteristic point combination between two adjacent video frames.
6. The method according to claim 5, wherein the step of selecting an optimal feature point combination from all the associated feature point combinations according to the second Euclidean distance value and the Hamming distance value corresponding to each associated feature point combination comprises:
calculating a second Euclidean distance value and a Hamming distance value between two feature points in each associated feature point combination;
determining a minimum Euclidean distance value R1 from all the second Euclidean distance values, and determining a minimum Hamming distance value R2 from all of said Hamming distance values;
performing cluster analysis on the associated feature point combinations whose second Euclidean distance value is not more than 2R1 and whose Hamming distance value is not more than 2R2, so as to obtain the optimal feature point combination.
7. The method of claim 1, wherein the step of compensating the background region based on the transformation parameters to obtain background compensation data comprises:
constructing a background region compensation model according to the transformation parameters, wherein the transformation parameters comprise at least one of translation amount, scaling amount, turnover amount, rotation amount and shearing amount;
and inputting the pixel point information contained in the background area into the background area compensation model to obtain the background compensation data.
8. The method according to claim 1, wherein the step of fusing the foreground compensation data and the background compensation data to obtain a target compensation image comprises:
fusing the foreground compensation data and the background compensation data in different scales by using an image pyramid model to obtain compensation images in different scales;
and selecting a target compensation image from the compensation images with different scales according to a selection instruction triggered by a user, wherein the target compensation image is any one of all the compensation images with different scales.
9. An apparatus for video motion compensation, the apparatus comprising:
the extraction module is used for extracting foreground objects and background areas of two adjacent video frames in the video frame sequence;
the compensation module is used for obtaining foreground compensation data according to the pixel point information of the foreground target of every two adjacent video frames and obtaining background compensation data according to the pixel point coordinate transformation parameters of the background area of every two adjacent video frames; the foreground target compensation method specifically comprises the steps of obtaining pixel point information and a centroid coordinate of the foreground target, and compensating the foreground target by using the pixel point information and the centroid coordinate to obtain foreground compensation data; solving transformation parameters for affine transformation of the background area by using the characteristic point distance of the background area in every two adjacent video frames, and compensating the background area based on the transformation parameters to obtain background compensation data;
and the fusion module is used for fusing the foreground compensation data and the background compensation data to obtain a target compensation image and inserting the target compensation image between two adjacent video frames.
10. A computer device, characterized in that it comprises a processor and a memory, said memory storing a computer program which, when executed on said processor, implements the video motion compensation method of any one of claims 1 to 8.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed on a processor, implements the video motion compensation method of any of claims 1 to 8.
CN202110199719.2A 2021-02-22 2021-02-22 Video motion compensation method and device and computer equipment Active CN112995678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110199719.2A CN112995678B (en) 2021-02-22 2021-02-22 Video motion compensation method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110199719.2A CN112995678B (en) 2021-02-22 2021-02-22 Video motion compensation method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN112995678A CN112995678A (en) 2021-06-18
CN112995678B true CN112995678B (en) 2022-10-25

Family

ID=76349536

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110199719.2A Active CN112995678B (en) 2021-02-22 2021-02-22 Video motion compensation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN112995678B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114079725B (en) * 2020-08-13 2023-02-07 华为技术有限公司 Video anti-shake method, terminal device, and computer-readable storage medium
CN114205672A (en) * 2021-12-13 2022-03-18 浙江湖州兆龙网络科技有限公司 Video compression processor of motion compensation technology
CN114339395A (en) * 2021-12-14 2022-04-12 浙江大华技术股份有限公司 Video jitter detection method, detection device, electronic equipment and readable storage medium
CN115297313B (en) * 2022-10-09 2023-04-25 南京芯视元电子有限公司 Micro display dynamic compensation method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003032688A (en) * 2001-07-18 2003-01-31 Nippon Telegr & Teleph Corp <Ntt> Separation method of foreground and background regions for moving image, and moving image coding method by conditional pixel replenishment by using this method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101478678B (en) * 2008-12-30 2011-06-01 西安交通大学 Time-domain filtering method based on interested region motion compensation
CN101621693B (en) * 2009-07-31 2011-01-05 重庆大学 Frame frequency lifting method for combining target partition and irregular block compensation
CN103167304B (en) * 2013-03-07 2015-01-21 海信集团有限公司 Method and device for improving a stereoscopic video frame rates
CN107968946B (en) * 2016-10-18 2021-09-21 深圳万兴信息科技股份有限公司 Video frame rate improving method and device
CN110825123A (en) * 2019-10-21 2020-02-21 哈尔滨理工大学 Control system and method for automatic following loading vehicle based on motion algorithm

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003032688A (en) * 2001-07-18 2003-01-31 Nippon Telegr & Teleph Corp <Ntt> Separation method of foreground and background regions for moving image, and moving image coding method by conditional pixel replenishment by using this method

Also Published As

Publication number Publication date
CN112995678A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN112995678B (en) Video motion compensation method and device and computer equipment
CN110176027B (en) Video target tracking method, device, equipment and storage medium
Lebreton et al. GBVS360, BMS360, ProSal: Extending existing saliency prediction models from 2D to omnidirectional images
US11095833B2 (en) Automatic composition of composite images or videos from frames captured with moving camera
US9430817B2 (en) Blind image deblurring with cascade architecture
CN111951325B (en) Pose tracking method, pose tracking device and electronic equipment
CN111179159B (en) Method and device for eliminating target image in video, electronic equipment and storage medium
Koh et al. Video stabilization based on feature trajectory augmentation and selection and robust mesh grid warping
CN110223236B (en) Method for enhancing image sequences
Li et al. A maximum a posteriori estimation framework for robust high dynamic range video synthesis
Wu et al. Global motion estimation with iterative optimization-based independent univariate model for action recognition
CN113989460B (en) Real-time sky replacement special effect control method and device for augmented reality scene
CN109285122A (en) A kind of method and apparatus carrying out image procossing
Li et al. Video retargeting with multi-scale trajectory optimization
CN110309721A (en) Method for processing video frequency, terminal and storage medium
Qiao et al. Temporal coherence-based deblurring using non-uniform motion optimization
Calagari et al. Data driven 2-D-to-3-D video conversion for soccer
Lee Novel video stabilization for real-time optical character recognition applications
CN111988520B (en) Picture switching method and device, electronic equipment and storage medium
CN114387290A (en) Image processing method, image processing apparatus, computer device, and storage medium
Yu et al. Animation line art colorization based on the optical flow method
CN113516674A (en) Image data detection method and device, computer equipment and storage medium
Zhen et al. Inertial sensor aided multi-image nonuniform motion blur removal based on motion decomposition
CN117456097B (en) Three-dimensional model construction method and device
CN113962964B (en) Specified object erasing method and device based on time sequence image data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20240810
Address after: 518000, A1306, Skyworth Building, No. 008 Gaoxin South 1st Road, Gaoxin Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province
Patentee after: Shenzhen Skyworth Display Technology Co.,Ltd.
Country or region after: China
Address before: 518000 13-16 / F, block a, South Skyworth building, Shennan Avenue, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province (office only)
Patentee before: SHENZHEN SKYWORTH-RGB ELECTRONIC Co.,Ltd.
Country or region before: China
Patentee before: NANJING SKYWORTH INSTITUTE OF INFORMATION TECHNOLOGY Co.,Ltd.