CN115209160A - Video compression method, electronic device and readable storage medium - Google Patents

Video compression method, electronic device and readable storage medium

Info

Publication number
CN115209160A
CN115209160A
Authority
CN
China
Prior art keywords
video
sampling
frame
sampled
compressed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210663387.3A
Other languages
Chinese (zh)
Inventor
王荣刚
杨佳宇
王振宇
高文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN202210663387.3A priority Critical patent/CN115209160A/en
Priority to PCT/CN2022/121439 priority patent/WO2023240835A1/en
Publication of CN115209160A publication Critical patent/CN115209160A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses a video compression method, an electronic device and a readable storage medium in the field of image processing. The video compression method includes: acquiring a video to be compressed and down-sampling it to obtain a down-sampled video; encoding and decoding each picture frame in the down-sampled video to compress it into a down-sampled compressed video; and up-sampling the down-sampled compressed video to obtain a target video. The application addresses the technical problem of low compression efficiency in video compression.

Description

Video compression method, electronic device and readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a video compression method, an electronic device, and a readable storage medium.
Background
With the rapid development of science and technology, image processing has matured considerably. At present, video encoding and decoding are performed with compression algorithms whose coding frameworks and coding tools leave limited room for functional optimization. As a result, if the quality of the compressed video must be preserved, that is, fewer video features lost, a higher bit rate is required during encoding and decoding and compression efficiency is low; conversely, if compression efficiency must be guaranteed, that is, the bit rate is reduced, more features tend to be lost during encoding and decoding and compression quality suffers. In either case, the compression efficiency of video compression is low.
Disclosure of Invention
The present application mainly aims to provide a video compression method, an electronic device, and a readable storage medium, and aims to solve the technical problem in the prior art that the compression efficiency of video compression is low.
In order to achieve the above object, the present application provides a video compression method applied to a video compression device, the video compression method including:
acquiring a video to be compressed, and performing down-sampling on the video to be compressed to obtain a down-sampled video;
coding and decoding each picture frame in the down-sampling video to compress the down-sampling video to obtain a down-sampling compressed video;
and performing up-sampling on the down-sampled compressed video to obtain a target video.
To achieve the above object, the present application further provides a video compression apparatus applied to a video compression device, the video compression apparatus including:
the down-sampling module is used for acquiring a video to be compressed and down-sampling the video to be compressed to obtain a down-sampled video;
the coding and decoding module is used for coding and decoding each picture frame in the down-sampled video so as to compress the down-sampled video and obtain a down-sampled compressed video;
and the up-sampling module is used for up-sampling the down-sampled compressed video to obtain a target video.
The present application further provides an electronic device, including: a memory, a processor and a program of the video compression method stored on the memory and executable on the processor, which program, when executed by the processor, is operable to implement the steps of the video compression method as described above.
The present application also provides a computer-readable storage medium having stored thereon a program for implementing a video compression method, which when executed by a processor implements the steps of the video compression method as described above.
The present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the video compression method as described above.
Compared with video encoding and decoding through a compression algorithm alone, the video compression method, electronic device and readable storage medium of the present application acquire a video to be compressed and down-sample it to obtain a down-sampled video; encode and decode each picture frame in the down-sampled video so as to compress it into a down-sampled compressed video; and up-sample the down-sampled compressed video to obtain a target video. Down-sampling the video to be compressed reduces the bit rate required for encoding and decoding, while up-sampling the down-sampled compressed video ensures that fewer features are lost after compression. This avoids the technical defect that, when a compression algorithm alone is used for encoding and decoding, guaranteeing compression efficiency easily causes more features to be lost and hence low compression quality, so the compression efficiency of video compression is improved while its compression quality is maintained.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below; other drawings can be obtained from them by those skilled in the art without inventive labor.
FIG. 1 is a schematic flow chart of a first embodiment of a video compression method according to the present application;
fig. 2 is a schematic flow chart of time-space domain down-sampling and time-space domain up-sampling according to an embodiment of the video compression method of the present application;
fig. 3 is a schematic device structure diagram of a hardware operating environment related to a video compression method in an embodiment of the present application.
The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments of the present application are described in detail below with reference to the accompanying drawings. It is to be understood that the embodiments described are only a part of the embodiments of the present application and not all of them. All other embodiments obtained by a person skilled in the art from the embodiments herein without creative effort shall fall within the protection scope of the present application.
Example one
In a first embodiment of the video compression method according to the present application, with reference to fig. 1, the video compression method includes:
step S10, acquiring a video to be compressed, and performing down-sampling on the video to be compressed to obtain a down-sampled video;
step S20, coding and decoding each picture frame in the down-sampled video to compress the down-sampled video to obtain a down-sampled compressed video;
and step S30, performing up-sampling on the down-sampled compressed video to obtain a target video.
Exemplarily, steps S10 to S30 include: acquiring a video to be compressed, and performing time domain and/or space domain down-sampling on the video to be compressed to obtain a down-sampled video; coding the down-sampling video through an image coder to obtain a down-sampling coded video, and decoding the down-sampling coded video through an image decoder to obtain a down-sampling compressed video; and performing up-sampling on the down-sampled compressed video to obtain a target video.
Optionally, down-sampling may be performed only in the time domain, with the corresponding up-sampling also in the time domain; only in the spatial domain, with the corresponding up-sampling in the spatial domain; or in both the time domain and the spatial domain, with the corresponding up-sampling in both domains. Because the three modes differ little in how many features are lost after video compression, down-sampling in both the time domain and the spatial domain, with corresponding up-sampling in both domains, is preferred in order to improve compression efficiency.
Optionally, as an example and referring to fig. 2, the video to be encoded is taken as the video to be compressed, vfi_flag is the spatial down-sampling flag bit corresponding to it, and vsr_flag is the temporal down-sampling flag bit corresponding to it. The video to be encoded is subjected to temporal down-sampling and spatial down-sampling to obtain a down-sampled video to be encoded together with its corresponding spatial and temporal down-sampling flag bits; the down-sampled video is encoded and decoded to obtain a down-sampled decoded video; and the down-sampled decoded video is subjected to temporal up-sampling and spatial up-sampling according to the temporal down-sampling flag bit and the spatial down-sampling flag bit to obtain the decoded video.
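As an illustrative, non-limiting sketch of the flow of fig. 2, the Python fragment below walks a toy video through down-sampling, a codec round trip and up-sampling guided by the recorded flag bits. The function names, the use of vfi_flag for the spatial flag and vsr_flag for the temporal flag (following the correspondence stated above), and the crude striding and repetition used in place of learned resampling are assumptions introduced only for illustration.

```python
# Minimal, non-limiting sketch of the fig. 2 flow; all names are assumptions.
import torch

def downsample(video):
    # Time domain down-sampling (frame extraction) followed by space domain
    # down-sampling; the flag bits record what was applied.
    frames = video[::2]                                  # drop every other frame
    frames = [f[:, ::2, ::2] for f in frames]            # halve resolution (toy striding)
    flags = {"vsr_flag": 1, "vfi_flag": 1}               # temporal / spatial flags as above
    return frames, flags

def codec_roundtrip(frames):
    # Stand-in for encoding each picture frame and decoding it again.
    return [f.clone() for f in frames]

def upsample(frames, flags):
    # Time domain then space domain up-sampling, guided by the flag bits.
    if flags["vsr_flag"]:                                # temporal flag (per the text above)
        frames = [f for f in frames for _ in range(2)]   # naive frame repetition
    if flags["vfi_flag"]:                                # spatial flag (per the text above)
        frames = [f.repeat_interleave(2, dim=1).repeat_interleave(2, dim=2) for f in frames]
    return frames

video = [torch.rand(3, 64, 64) for _ in range(8)]        # video to be compressed
ds_video, flags = downsample(video)                      # down-sampled video + flag bits
compressed = codec_roundtrip(ds_video)                   # down-sampled compressed video
target_video = upsample(compressed, flags)               # target (decoded) video
```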
In step S20, the encoding and decoding of each picture frame in the downsampled video to compress the downsampled video to obtain a downsampled compressed video includes:
step S21, coding and decoding each picture frame in the down-sampling video to obtain at least one down-sampling picture frame, and taking the decoded picture frame of each down-sampling picture frame at the previous time step as a down-sampling decoding reference frame corresponding to the down-sampling picture frame;
step S22, determining a prediction frame of each down-sampling picture frame according to the down-sampling picture frame and the corresponding down-sampling decoding reference frame;
and step S23, determining a decoding reconstruction frame corresponding to each downsampling picture frame according to the prediction frame and the downsampling picture frame to obtain the downsampling compressed video.
Exemplarily, steps S21 to S23 include: grouping the picture frames in the down-sampled video according to preset grouping information and the first timing information corresponding to each picture frame, so as to obtain the down-sampled picture frames and the picture frames at the previous time steps of each down-sampled picture frame; encoding the picture frames at the previous time steps through an image encoder to obtain down-sampled encoded reference frames, and decoding the down-sampled encoded reference frames through an image decoder to obtain down-sampled decoded reference frames, or pulling the already decoded picture frame at the previous time step from a reference queue buffer and taking it as the down-sampled decoded reference frame corresponding to the down-sampled picture frame; determining the predicted frame of each down-sampled picture frame according to the relative motion information between the down-sampled picture frame and the corresponding down-sampled decoded reference frame; and determining the decoded reconstructed frame corresponding to each down-sampled picture frame according to the inter-frame information between the predicted frame and the down-sampled picture frame, and arranging the decoded reconstructed frames according to their second timing information to obtain the down-sampled compressed video. The preset grouping information may specify a first number of down-sampled picture frames and a second number of previous-time-step picture frames, which divide the picture frames in the down-sampled video into down-sampled picture frames and picture frames at the previous time steps. For example, when each group contains one down-sampled picture frame preceded by three reference picture frames and the down-sampled video contains 8 picture frames, the fourth frame is taken as the first down-sampled picture frame, the first to third frames as the previous-time-step picture frames corresponding to it, the eighth frame as the second down-sampled picture frame, and the fifth to seventh frames as the previous-time-step picture frames corresponding to it.
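A minimal sketch of the grouping described above follows, assuming each group consists of three previous-time-step reference frames followed by one down-sampled picture frame, as in the 8-frame example; the function name and group size are illustrative assumptions.

```python
# Sketch of the grouping above: three preceding reference frames per one
# down-sampled picture frame (assumed group size, matching the 8-frame example).
def group_frames(frames, refs_per_group=3):
    groups = []
    group_size = refs_per_group + 1
    for start in range(0, len(frames) - group_size + 1, group_size):
        reference_frames = frames[start:start + refs_per_group]  # previous time steps
        target_frame = frames[start + refs_per_group]            # down-sampled picture frame
        groups.append((reference_frames, target_frame))
    return groups

frames = [f"frame_{i}" for i in range(1, 9)]  # stand-ins for the 8 picture frames
for refs, target in group_frames(frames):
    print(refs, "->", target)
# ['frame_1', 'frame_2', 'frame_3'] -> frame_4
# ['frame_5', 'frame_6', 'frame_7'] -> frame_8
```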
In step S22, the step of determining a predicted frame of each of the down-sampled picture frames according to the down-sampled picture frame and the corresponding down-sampled decoded reference frame includes:
step A10, determining first relative motion data between each down-sampled picture frame and a down-sampled decoded reference frame corresponding to each down-sampled picture frame;
and step A20, predicting each down-sampled picture frame according to the inter-frame feature vector in the first relative motion data and a preset inter-frame prediction model to obtain the prediction frame.
In this embodiment, it should be noted that the preset inter-frame prediction model is a preset model for predicting the downsampled picture frame, and the preset inter-frame prediction model may be a convolutional neural network, a cyclic neural network, or a deformable convolutional network.
Exemplarily, steps A10 to A20 include: obtaining the first relative motion data according to the down-sampled picture frame and the first motion information of each pixel point in the down-sampled decoded reference frame corresponding to the down-sampled picture frame, wherein the first relative motion data includes at least one of the motion speed, motion acceleration and position information of each pixel point in the corresponding down-sampled decoded reference frame, and the position information may be coordinate information or distance information; performing feature extraction on the first relative motion data to obtain the inter-frame feature vector corresponding to the down-sampled picture frame; and inputting the inter-frame feature vector into the preset inter-frame prediction model, which maps the inter-frame feature vector into a picture frame of the down-sampled picture frame to obtain the predicted frame.
By predicting multiple down-sampled picture frames through the deformable convolution network, the method avoids the technical defects of optical-flow-based prediction, in which only a single down-sampled picture frame can be predicted and prediction accuracy drops when parameters such as video brightness change, thereby improving both the accuracy and the efficiency of inter-frame prediction.
The diverse motion offset predictions produced by the deformable convolution network also avoid the technical defect of optical-flow-based motion offset prediction, whose single prediction tends to yield low accuracy; the accuracy of motion offset prediction, and hence of inter-frame prediction, is thereby improved.
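The following hedged PyTorch sketch illustrates one way such a deformable-convolution-based inter-frame predictor could look; the layer sizes, the feature network, and the use of torchvision.ops.DeformConv2d are assumptions introduced for illustration and are not asserted to be the patented network.

```python
# Hedged sketch of inter-frame prediction with a deformable convolution
# (torchvision.ops.DeformConv2d); channel sizes and layers are assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class InterFramePredictor(nn.Module):
    def __init__(self, channels=3, features=32, kernel_size=3):
        super().__init__()
        # Extract an inter-frame feature vector from the stacked reference frame
        # and current frame (a stand-in for the first relative motion data).
        self.feature_net = nn.Sequential(
            nn.Conv2d(2 * channels, features, 3, padding=1), nn.ReLU(),
            nn.Conv2d(features, features, 3, padding=1), nn.ReLU(),
        )
        # Predict one (dy, dx) offset per kernel sample position.
        self.offset_net = nn.Conv2d(features, 2 * kernel_size * kernel_size, 3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=1)

    def forward(self, decoded_reference, downsampled_frame):
        feats = self.feature_net(torch.cat([decoded_reference, downsampled_frame], dim=1))
        offsets = self.offset_net(feats)
        # Sample the decoded reference frame with the learned offsets
        # to obtain the predicted frame.
        return self.deform(decoded_reference, offsets)

predictor = InterFramePredictor()
ref = torch.rand(1, 3, 64, 64)      # down-sampled decoded reference frame
cur = torch.rand(1, 3, 64, 64)      # down-sampled picture frame
predicted_frame = predictor(ref, cur)
```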
In step S23, the step of determining, according to the predicted frame and the downsampled picture frame, a decoded reconstructed frame corresponding to each downsampled picture frame includes:
step B10, determining the frame difference between each down-sampling picture frame and the corresponding prediction frame to obtain the prediction residual error corresponding to each down-sampling picture frame;
and step B20, determining a decoded reconstructed frame corresponding to each down-sampling picture frame according to the residual decoded value corresponding to the predicted residual and the predicted frame.
Exemplarily, the steps B10 to B20 include: performing frame difference calculation on each down-sampled picture frame and the corresponding prediction frame to obtain a prediction residual error corresponding to the down-sampled picture frame; and coding and decoding the prediction residual by a residual coder-decoder to obtain a residual decoding value corresponding to the prediction residual, aggregating the residual decoding value and the prediction frame to obtain the decoded reconstruction frame, and adding the decoded reconstruction frame to a decoded image buffer area.
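A minimal sketch of steps B10 to B20 follows; the uniform quantization used as a stand-in for the residual codec is an assumption introduced only to make the round trip concrete.

```python
# Sketch of steps B10-B20: prediction residual, residual round trip, reconstruction.
# Uniform quantization stands in for the residual encoder/decoder (an assumption).
import torch

def residual_codec_roundtrip(residual, step=0.05):
    # Quantize/dequantize as a toy stand-in for residual encoding + decoding.
    return torch.round(residual / step) * step

downsampled_frame = torch.rand(1, 3, 64, 64)   # down-sampled picture frame
predicted_frame = torch.rand(1, 3, 64, 64)     # output of the inter-frame predictor

prediction_residual = downsampled_frame - predicted_frame          # step B10
decoded_residual = residual_codec_roundtrip(prediction_residual)   # residual decoded value
decoded_reconstructed_frame = predicted_frame + decoded_residual   # step B20

# The reconstructed frame would then be appended to the decoded image buffer
# (reference queue) for use as a reference at the next time step.
```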
Compared with video encoding and decoding through a compression algorithm alone, the video compression method provided by this embodiment of the present application acquires a video to be compressed and down-samples it to obtain a down-sampled video; encodes and decodes each picture frame in the down-sampled video so as to compress it into a down-sampled compressed video; and up-samples the down-sampled compressed video to obtain a target video. Down-sampling the video to be compressed reduces the bit rate of encoding and decoding, and up-sampling the down-sampled compressed video ensures that fewer features are lost after video compression, avoiding the technical defect that guaranteeing compression efficiency with a compression algorithm alone easily causes more features to be lost and hence low compression quality; the compression efficiency of video compression is therefore improved while its compression quality is maintained.
Example two
Further, in another embodiment of the present application based on the first embodiment, contents that are the same as or similar to the first embodiment may refer to the description above and are not repeated here. On this basis, in step S10, the down-sampling of the video to be compressed to obtain a down-sampled video includes:
s11, performing time domain down-sampling on the video to be compressed to obtain a time domain down-sampled video, and performing space domain down-sampling on the time domain down-sampled video to obtain a down-sampled video; and/or the presence of a gas in the gas,
and S12, performing space domain down-sampling on the video to be compressed to obtain a space domain down-sampled video, and performing time domain down-sampling on the space domain down-sampled video to obtain the down-sampled video.
Exemplarily, steps S11 to S12 include: performing time domain down-sampling on the video to be compressed through temporal frame extraction to obtain a time domain down-sampled video and the time domain down-sampling flag bit corresponding to it. The temporal frame extraction may take a frame at a preset inter-frame interval, for example every three frames; at every step change, for example the frames before and after a step phenomenon; or at a preset time interval, for example every 2 seconds. Space domain down-sampling is then performed on the time domain down-sampled video through pixel interpolation to obtain the down-sampled video and the space domain down-sampling flag bit corresponding to it. The pixel interpolation may be determined by bilinear interpolation, nearest-neighbour interpolation or bicubic interpolation. Nearest-neighbour interpolation directly takes the pixel value closest to the interpolation point, which tends to make the spatially down-sampled video differ noticeably from the video to be compressed; bilinear interpolation avoids this and offers good efficiency and quality; bicubic interpolation uses more sample points than bilinear interpolation, so the interpolated pixels are smoother and less jagged, and bicubic interpolation is therefore preferred. And/or, space domain down-sampling is performed on the video to be compressed through pixel interpolation to obtain a space domain down-sampled video and its corresponding space domain down-sampling flag bit, and time domain down-sampling is performed on the space domain down-sampled video through temporal frame extraction to obtain the down-sampled video and its corresponding time domain down-sampling flag bit.
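The sketch below illustrates the interpolation choices and the temporal frame extraction discussed above; the scale factor and the one-in-three extraction interval are illustrative assumptions.

```python
# Sketch comparing the three pixel-interpolation choices for space domain
# down-sampling; the scale factor is an illustrative assumption.
import torch
import torch.nn.functional as F

frame = torch.rand(1, 3, 128, 128)   # one picture frame of the video to be compressed

# Nearest neighbour: cheapest, but can visibly differ from the source content.
nearest = F.interpolate(frame, scale_factor=0.5, mode="nearest")
# Bilinear: a good balance of efficiency and quality.
bilinear = F.interpolate(frame, scale_factor=0.5, mode="bilinear", align_corners=False)
# Bicubic: more sample points, smoother and less jagged, hence preferred here.
bicubic = F.interpolate(frame, scale_factor=0.5, mode="bicubic", align_corners=False)

# Time domain down-sampling by frame extraction, e.g. keeping one frame in three.
frames = [torch.rand(1, 3, 128, 128) for _ in range(9)]
temporally_downsampled = frames[::3]
```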
In step S30, the step of upsampling the downsampled compressed video to obtain a target video includes:
step S31, according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the down-sampled compressed video to obtain a time domain up-sampled video, and according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the time domain up-sampled video to obtain the target video; and/or,
and S32, according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the down-sampled compressed video to obtain a space domain up-sampled video, and according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the space domain up-sampled video to obtain the target video.
Exemplarily, steps S31 to S32 include: determining an extracted target time domain frame according to the time domain down-sampling flag bit, and performing time domain up-sampling on the down-sampled compressed video according to the target time domain frame to obtain a time domain up-sampled video; determining an extracted target space domain frame according to the space domain down-sampling flag bit, and performing space domain up-sampling on the time domain up-sampled video according to the target space domain frame to obtain the target video; and/or, determining an extracted target space domain frame according to the space domain down-sampling flag bit, and performing space domain up-sampling on the down-sampled compressed video according to the target space domain frame to obtain a space domain up-sampled video; and determining an extracted target time domain frame according to the time domain down-sampling flag bit, and performing time domain up-sampling on the space domain up-sampled video according to the target time domain frame to obtain the target video, wherein the order of the time domain up-sampling and the space domain up-sampling is consistent with the order of the time domain down-sampling and the space domain down-sampling.
In step S31, the up-sampling the down-sampled compressed video according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video to obtain a time domain up-sampled video includes:
step C10, predicting first motion offset data between the down-sampled decoding reference frames according to second relative motion data between the down-sampled decoding reference frames corresponding to the down-sampled picture frames in the down-sampled compressed video;
step C20, according to the first motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each down-sampling decoding reference frame to obtain at least one first alignment frame;
step C30, performing information aggregation on each first alignment frame to obtain multi-frame aggregation information;
and step C40, determining the time domain up-sampling video according to the multi-frame aggregation information and the time domain sampling flag bit.
Exemplarily, steps C10 to C40 include: determining the second relative motion data among the down-sampled decoding reference frames according to second motion information of each pixel point in the down-sampled decoding reference frames corresponding to each down-sampled picture frame in the down-sampled compressed video, wherein the second relative motion data includes at least one of the motion speed, motion acceleration and position information of each pixel point in each down-sampled decoding reference frame; predicting the first motion offset data among the down-sampled decoding reference frames according to the second relative motion data and a preset motion offset prediction model, that is, extracting features from the second relative motion data to obtain first motion offset features and mapping the first motion offset features into the first motion offset data through the preset motion offset prediction model; inputting the first motion offset data into a preset inter-frame alignment model, which maps the first motion offset data into at least one first alignment frame; performing information aggregation on each first alignment frame according to the channel dimension corresponding to the first alignment frame to obtain the multi-frame aggregation information; and determining the time domain up-sampled video according to the multi-frame aggregation information and the time domain sampling flag bit.
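A hedged sketch of steps C10 to C40 follows: offsets are predicted from two neighbouring down-sampled decoding reference frames, each reference frame is aligned with a deformable convolution, the alignment frames are concatenated along the channel dimension, and an intermediate frame is synthesised. All layer sizes and names are assumptions introduced for illustration.

```python
# Hedged sketch of temporal up-sampling (steps C10-C40): offset prediction,
# deformable alignment, channel-dimension aggregation, frame synthesis.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class TemporalUpsampler(nn.Module):
    def __init__(self, channels=3, kernel_size=3):
        super().__init__()
        # Predict motion offsets from the two neighbouring decoded reference frames.
        self.offset_net = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size, 3, padding=1)
        # Align each reference frame towards the missing intermediate time step.
        self.align = DeformConv2d(channels, channels, kernel_size, padding=1)
        # Fuse the aligned frames (concatenated along the channel dimension).
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, prev_ref, next_ref):
        offsets = self.offset_net(torch.cat([prev_ref, next_ref], dim=1))
        aligned_prev = self.align(prev_ref, offsets)   # one alignment frame
        aligned_next = self.align(next_ref, offsets)   # another alignment frame
        aggregated = torch.cat([aligned_prev, aligned_next], dim=1)  # multi-frame aggregation
        return self.fuse(aggregated)                   # synthesised intermediate frame

upsampler = TemporalUpsampler()
prev_ref = torch.rand(1, 3, 64, 64)
next_ref = torch.rand(1, 3, 64, 64)
interpolated = upsampler(prev_ref, next_ref)
# Inserted between prev_ref and next_ref only when the temporal down-sampling flag is set.
```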
In step S31, the spatial upsampling the time-domain upsampled video according to the spatial downsampling flag bit corresponding to the downsampled compressed video to obtain the target video includes:
step D10, predicting second motion offset data between the down-sampled picture frame and each down-sampled decoded reference frame according to third relative motion data between the down-sampled decoded reference frame and the down-sampled picture frame in the time-domain up-sampled video;
step D20, according to the second motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each downsampling decoding reference frame to obtain at least one second alignment frame;
step D30, performing information fusion on each second alignment frame to obtain multi-frame fusion information;
and step D40, determining the target video according to the multi-frame fusion information and the spatial downsampling flag bit.
Exemplarily, steps D10 to D40 include: determining the third relative motion data between the down-sampled decoded reference frames and the down-sampled picture frames according to third motion information of each pixel point in the down-sampled decoded reference frames and the down-sampled picture frames in the time-domain up-sampled video, wherein the third relative motion data includes at least one of the motion speed, motion acceleration and position information of each pixel point in the down-sampled decoded reference frames and the down-sampled picture frames; predicting the second motion offset data between the down-sampled picture frame and each down-sampled decoded reference frame according to the third relative motion data and a preset motion offset prediction model; inputting the second motion offset data into a preset inter-frame alignment model, which maps the second motion offset data into at least one second alignment frame; performing information fusion on each second alignment frame according to the channel dimension corresponding to the second alignment frame to obtain the multi-frame fusion information; and performing pixel turnover on the multi-frame fusion information according to the spatial downsampling flag bit to obtain the target video.
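A hedged sketch of steps D10 to D40 follows. Reading the "pixel turnover" of the multi-frame fusion information as a sub-pixel (PixelShuffle) rearrangement is an assumption, as are the layer sizes and names.

```python
# Hedged sketch of spatial up-sampling (steps D10-D40). PixelShuffle as the
# "pixel turnover" step is an assumption; layer sizes are illustrative.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class SpatialUpsampler(nn.Module):
    def __init__(self, channels=3, kernel_size=3, scale=2):
        super().__init__()
        self.offset_net = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size, 3, padding=1)
        self.align = DeformConv2d(channels, channels, kernel_size, padding=1)
        # Fuse aligned frames, expand channels by scale**2, then rearrange pixels.
        self.fuse = nn.Conv2d(2 * channels, channels * scale * scale, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, decoded_reference, picture_frame):
        offsets = self.offset_net(torch.cat([decoded_reference, picture_frame], dim=1))
        aligned = self.align(decoded_reference, offsets)                # second alignment frame
        fused = self.fuse(torch.cat([aligned, picture_frame], dim=1))   # multi-frame fusion
        return self.shuffle(fused)                                      # higher-resolution frame

upsampler = SpatialUpsampler()
ref = torch.rand(1, 3, 32, 32)   # down-sampled decoded reference frame
cur = torch.rand(1, 3, 32, 32)   # down-sampled picture frame
high_res = upsampler(ref, cur)   # (1, 3, 64, 64), applied when the spatial flag is set
```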
In this embodiment, the preset inter-frame alignment model is preferably a deformable convolution network. Its diverse motion offset predictions avoid the technical defect of optical-flow-based motion offset prediction, whose single prediction tends to yield low accuracy, thereby improving the accuracy of motion offset prediction and hence the accuracy of inter-frame prediction.
Compared with video encoding and decoding through a compression algorithm alone, the video compression method of this embodiment acquires a video to be compressed and down-samples it to obtain a down-sampled video; encodes and decodes each picture frame in the down-sampled video so as to compress it into a down-sampled compressed video; and up-samples the down-sampled compressed video to obtain a target video. Down-sampling reduces the bit rate of encoding and decoding, and up-sampling ensures that fewer features are lost after compression, avoiding the technical defect that guaranteeing compression efficiency with a compression algorithm alone easily leads to the loss of more features and low compression quality; compression efficiency is therefore improved while compression quality is maintained.
Example three
An embodiment of the present application further provides a video compression apparatus, where the video compression apparatus is applied to a video compression device, and the video compression apparatus includes:
the down-sampling module is used for acquiring a video to be compressed and down-sampling the video to be compressed to obtain a down-sampled video;
the coding and decoding module is used for coding and decoding each picture frame in the down-sampling video so as to compress the down-sampling video and obtain a down-sampling compressed video;
and the up-sampling module is used for up-sampling the down-sampled compressed video to obtain a target video.
Optionally, the down-sampling module is further configured to:
performing time domain down-sampling on the video to be compressed to obtain a time domain down-sampled video, and performing space domain down-sampling on the time domain down-sampled video to obtain a down-sampled video; and/or,
and performing space domain downsampling on the video to be compressed to obtain a space domain downsampled video, and performing time domain downsampling on the space domain downsampled video to obtain the downsampled video.
Optionally, the encoding and decoding module is further configured to:
coding and decoding each picture frame in the down-sampling video to obtain at least one down-sampling picture frame, and taking the decoded picture frame of each down-sampling picture frame at the previous time step as a down-sampling decoding reference frame corresponding to the down-sampling picture frame;
determining a predicted frame of each of the down-sampled picture frames according to the down-sampled picture frame and a corresponding down-sampled decoded reference frame;
and determining a decoding reconstruction frame corresponding to each down-sampling picture frame according to the prediction frame and the down-sampling picture frame to obtain the down-sampling compressed video.
Optionally, the encoding and decoding module is further configured to:
determining first relative motion data between each of the down-sampled picture frames and a down-sampled decoded reference frame corresponding to each of the down-sampled picture frames;
and predicting each down-sampling picture frame according to the inter-frame feature vector in the first relative motion data and a preset inter-frame prediction model to obtain the prediction frame.
Optionally, the encoding and decoding module is further configured to:
determining the frame difference between each down-sampling picture frame and the corresponding prediction frame to obtain the prediction residual error corresponding to each down-sampling picture frame;
and determining a decoding reconstruction frame corresponding to each down-sampling picture frame according to the residual decoding value corresponding to the prediction residual and the prediction frame.
Optionally, the upsampling module is further configured to:
according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the down-sampled compressed video to obtain a time domain up-sampled video, and according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the time domain up-sampled video to obtain the target video; and/or,
according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the down-sampled compressed video to obtain space domain up-sampled video, and according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the space domain up-sampled video to obtain the target video.
Optionally, the upsampling module is further configured to:
predicting first motion offset data between the down-sampling decoding reference frames according to second relative motion data between the down-sampling decoding reference frames corresponding to the down-sampling picture frames in the down-sampling compressed video;
according to the first motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each down-sampling decoding reference frame to obtain at least one first alignment frame;
performing information aggregation on each first aligned frame to obtain multi-frame aggregation information;
and determining the time domain up-sampling video according to the multi-frame aggregation information and the time domain sampling flag bit.
Optionally, the upsampling module is further configured to:
predicting second motion offset data between the downsampled picture frame and each downsampled decoded reference frame according to third relative motion data between the downsampled decoded reference frame and the downsampled picture frame in the time-domain upsampled video;
according to the second motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each downsampling decoding reference frame to obtain at least one second alignment frame;
performing information fusion on each second alignment frame to obtain multi-frame fusion information;
and determining the target video according to the multi-frame fusion information and the space domain down-sampling flag bit.
The video compression device provided by the application adopts the video compression method in the embodiment, and the technical problem of low compression efficiency of video compression is solved. Compared with the prior art, the beneficial effects of the video compression apparatus provided by the embodiment of the present application are the same as the beneficial effects of the video compression method provided by the above embodiment, and other technical features of the video compression apparatus are the same as the features disclosed in the above embodiment method, which are not repeated herein.
Example four
An embodiment of the present application provides an electronic device, which includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video compression method of the above embodiments.
Referring now to FIG. 3, shown is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 3, the electronic device may include a processing apparatus (e.g., a central processing unit, a graphic processor, etc.) that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage apparatus into a Random Access Memory (RAM). In the RAM, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device, the ROM, and the RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
Generally, the following systems may be connected to the I/O interface: input devices including, for example, touch screens, touch pads, keyboards, mice, image sensors, microphones, accelerometers, gyroscopes, and the like; output devices including, for example, Liquid Crystal Displays (LCDs), speakers, vibrators, and the like; storage devices including, for example, magnetic tape, hard disk, etc.; and a communication device. The communication means may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While the figures illustrate an electronic device with various systems, it is to be understood that not all illustrated systems are required to be implemented or provided. More or fewer systems may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means, or installed from a storage means, or installed from a ROM. The computer program, when executed by a processing device, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
The electronic device provided by the application adopts the video compression method in the embodiment, so that the technical problem of low compression efficiency of video compression is solved. Compared with the prior art, the beneficial effects of the electronic device provided by the embodiment of the present application are the same as the beneficial effects of the video compression method provided by the above embodiment, and other technical features of the electronic device are the same as those disclosed by the above embodiment method, which are not repeated herein.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the foregoing description of embodiments, the particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Example five
The present embodiment provides a computer-readable storage medium having computer-readable program instructions stored thereon for performing the method of the video compression method in the above-described embodiments.
The computer readable storage medium provided by the embodiments of the present application may be, for example, but is not limited to, a USB flash drive, or an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or device, or any combination of the above. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present embodiment, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer-readable storage medium may be embodied in an electronic device; or may be present alone without being incorporated into the electronic device.
The computer readable storage medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a video to be compressed, and performing down-sampling on the video to be compressed to obtain a down-sampled video; coding and decoding each picture frame in the down-sampled video to compress the down-sampled video to obtain a down-sampled compressed video; and performing up-sampling on the down-sampled compressed video to obtain a target video.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the names of the modules do not in some cases constitute a limitation of the unit itself.
The computer-readable storage medium provided by the application stores computer-readable program instructions for executing the video compression method, and solves the technical problem of low compression efficiency of video compression. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the embodiment of the present application are the same as those of the video compression method provided by the foregoing implementation, and no further description is given here.
Example six
The present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the video compression method as described above.
The computer program product provided by the application solves the technical problem of low compression efficiency of video compression. Compared with the prior art, the beneficial effects of the computer program product provided by the embodiment of the present application are the same as those of the video compression method provided by the above embodiment, and are not described herein again.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (10)

1. A video compression method, characterized in that the video compression method comprises:
acquiring a video to be compressed, and performing down-sampling on the video to be compressed to obtain a down-sampled video;
coding and decoding each picture frame in the down-sampled video to compress the down-sampled video to obtain a down-sampled compressed video;
and performing up-sampling on the down-sampled compressed video to obtain a target video.
2. The video compression method as claimed in claim 1, wherein said down-sampling said video to be compressed to obtain a down-sampled video comprises:
performing time domain down-sampling on the video to be compressed to obtain a time domain down-sampled video, and performing space domain down-sampling on the time domain down-sampled video to obtain a down-sampled video; and/or,
and performing space domain downsampling on the video to be compressed to obtain a space domain downsampled video, and performing time domain downsampling on the space domain downsampled video to obtain the downsampled video.
3. The video compression method as claimed in claim 1, wherein said step of coding and decoding each picture frame in the down-sampled video to compress the down-sampled video to obtain the down-sampled compressed video comprises:
selecting at least one down-sampling picture frame from the down-sampling video, and taking a decoded picture frame of each down-sampling picture frame at the last time step as a down-sampling decoding reference frame corresponding to the down-sampling picture frame;
determining a predicted frame of each of the down-sampled picture frames according to the down-sampled picture frame and a corresponding down-sampled decoded reference frame;
and determining a decoding reconstruction frame corresponding to each down-sampling picture frame according to the prediction frame and the down-sampling picture frame to obtain the down-sampling compressed video.
4. The video compression method of claim 3, wherein said determining a predicted frame for each of the downsampled picture frames based on the downsampled picture frame and a corresponding downsampled decoded reference frame comprises:
determining first relative motion data between each of the down-sampled picture frames and a down-sampled decoded reference frame corresponding to each of the down-sampled picture frames;
and predicting each downsampled picture frame according to the inter-frame feature vector in the first relative motion data and a preset inter-frame prediction model to obtain the predicted frame.
5. The video compression method of claim 3, wherein the step of determining, based on the predicted frame and the downsampled picture frame, a decoded reconstructed frame for each of the downsampled picture frames comprises:
determining the frame difference between each down-sampled picture frame and the corresponding predicted frame to obtain the predicted residual error corresponding to each down-sampled picture frame;
and determining a decoding reconstruction frame corresponding to each down-sampling picture frame according to the residual decoding value corresponding to the prediction residual and the prediction frame.
6. The video compression method of claim 1, wherein the step of upsampling the downsampled compressed video to obtain the target video comprises:
according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the down-sampled compressed video to obtain a time domain up-sampled video, and according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the time domain up-sampled video to obtain the target video; and/or,
according to the space domain down-sampling flag bit corresponding to the down-sampled compressed video, performing space domain up-sampling on the down-sampled compressed video to obtain a space domain up-sampled video, and according to the time domain down-sampling flag bit corresponding to the down-sampled compressed video, performing time domain up-sampling on the space domain up-sampled video to obtain the target video.
7. The video compression method as claimed in claim 6, wherein said up-sampling the down-sampled compressed video according to the time domain down-sampling flag corresponding to the down-sampled compressed video to obtain the time domain up-sampled video comprises:
predicting first motion offset data between the down-sampling decoding reference frames according to second relative motion data between the down-sampling decoding reference frames corresponding to the down-sampling picture frames in the down-sampling compressed video;
according to the first motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each down-sampling decoding reference frame to obtain at least one first alignment frame;
performing information aggregation on each first alignment frame to obtain multi-frame aggregation information;
and determining the time domain up-sampling video according to the multi-frame aggregation information and the time domain sampling flag bit.
8. The video compression method as claimed in claim 6, wherein said step of spatially upsampling said temporally upsampled video according to a spatial downsampling flag corresponding to said downsampled compressed video to obtain said target video comprises:
predicting second motion offset data between the downsampled picture frame and each downsampled decoded reference frame according to third relative motion data between the downsampled decoded reference frame and the downsampled picture frame in the time-domain upsampled video;
according to the second motion offset data and a preset inter-frame alignment model, performing inter-frame alignment on each downsampling decoding reference frame to obtain at least one second alignment frame;
performing information fusion on each second alignment frame to obtain multi-frame fusion information;
and determining the target video according to the multi-frame fusion information and the spatial downsampling flag bit.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and the number of the first and second groups,
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the video compression method of any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a program for implementing a video compression method, the program being executed by a processor to implement the steps of the video compression method according to any one of claims 1 to 8.
CN202210663387.3A 2022-06-13 2022-06-13 Video compression method, electronic device and readable storage medium Pending CN115209160A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210663387.3A CN115209160A (en) 2022-06-13 2022-06-13 Video compression method, electronic device and readable storage medium
PCT/CN2022/121439 WO2023240835A1 (en) 2022-06-13 2022-09-26 Video compression method, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210663387.3A CN115209160A (en) 2022-06-13 2022-06-13 Video compression method, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
CN115209160A true CN115209160A (en) 2022-10-18

Family

ID=83576338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210663387.3A Pending CN115209160A (en) 2022-06-13 2022-06-13 Video compression method, electronic device and readable storage medium

Country Status (2)

Country Link
CN (1) CN115209160A (en)
WO (1) WO2023240835A1 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7596179B2 (en) * 2002-02-27 2009-09-29 Hewlett-Packard Development Company, L.P. Reducing the resolution of media data
JP6042899B2 (en) * 2012-09-25 2016-12-14 日本電信電話株式会社 Video encoding method and device, video decoding method and device, program and recording medium thereof
CN108495130B (en) * 2017-03-21 2021-04-20 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, terminal, server and storage medium
CN109905717A (en) * 2017-12-11 2019-06-18 四川大学 A kind of H.264/AVC Encoding Optimization based on Space-time domain down-sampling and reconstruction
CN111726636A (en) * 2019-03-18 2020-09-29 四川大学 HEVC (high efficiency video coding) coding optimization method based on time domain downsampling and frame rate upconversion
CN111800630A (en) * 2019-04-09 2020-10-20 Tcl集团股份有限公司 Method and system for reconstructing video super-resolution and electronic equipment
CN112019846A (en) * 2020-07-26 2020-12-01 杭州皮克皮克科技有限公司 Adaptive coding method, system, device and medium based on deep learning

Also Published As

Publication number Publication date
WO2023240835A1 (en) 2023-12-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination