CN118368443B - Transformation method applied to video and image processing - Google Patents
Transformation method applied to video and image processing
- Publication number
- CN118368443B (application CN202410776166.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- transformation
- image
- data
- error
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention relates to a transformation method applied to video and image processing, comprising: constructing and loading a video and image adaptive error prediction model, wherein the model is trained based on historical video and image transformation data; preprocessing the input video and image data to prepare it for the basic DCT/IDCT transform steps; performing the basic DCT/IDCT transform steps while recording key intermediate variables at each stage of the transform; running the adaptive error prediction model to predict the error range of each step from the current transform characteristics and the historical data; and, after all transform steps are completed, performing post-processing: after the forward transform, an integer matrix and right-shift operations with fixed shift parameters are applied, and the output is adjusted through addition, multiplication and function operations.
Description
Technical Field
The invention belongs to the field of video processing, and particularly relates to a transformation method applied to video and image processing.
Background
Video and image processing technology plays a vital role in modern communication, entertainment, surveillance, medical diagnostics and many fields of scientific research. With the rapid development of the digital age, the demand for efficient, high-quality video and image encoding and decoding keeps increasing, requiring processing algorithms that maintain high compression efficiency without sacrificing too much visual quality. Among image and video compression technologies, the Discrete Cosine Transform (DCT) and its inverse (Inverse Discrete Cosine Transform, IDCT) are core components of a series of international standards such as JPEG and MPEG, owing to their excellent energy-compaction properties.
However, the conventional DCT/IDCT method faces several significant technical challenges in practical applications. First, the DCT is prone to accumulating errors during the transformation from the spatial domain to the frequency domain, especially in successive multi-level transform operations; these errors mainly result from quantization, limited-precision numerical operations (e.g., addition, subtraction, and shift operations), and approximations in the algorithmic implementation. The accumulated errors can significantly reduce the visual quality and compression efficiency of the reconstructed image, especially in application scenarios sensitive to image details and edges. Second, conventional DCT/IDCT algorithms often lack adaptivity: they cannot be flexibly adjusted to achieve optimal performance in the face of different types of video content (e.g., high dynamic range video, video under low-light conditions, or video containing fast-motion scenes) and different image quality requirements.
Furthermore, the handling of accumulated errors in the prior art mostly relies on post-processing techniques such as deblocking filtering, which are passive countermeasures against errors rather than active prevention and accurate correction. It is therefore particularly urgent to develop a novel DCT/IDCT method that can actively predict and compensate for errors during the transform while keeping the algorithmic complexity under control.
Disclosure of Invention
The present invention is directed to a transformation method applied to video and image processing, so as to solve the problems set forth in the background art.
In order to solve the technical problems, the invention provides the following technical scheme:
The transformation method applied to video and image processing comprises: constructing and loading a video and image adaptive error prediction model, wherein the model is trained based on historical video and image transformation data; preprocessing the input video and image data to prepare it for the basic DCT/IDCT transform steps; performing the basic DCT/IDCT transform steps while recording key intermediate variables at each stage of the transform;
running the adaptive error prediction model, and predicting the error range of each step according to the current transform characteristics and the historical data;
after all transform steps are completed, performing post-processing: after the forward transform, an integer matrix and right-shift operations with fixed shift parameters are applied, and the output is adjusted through addition, multiplication and function operations.
Further, the construction of the video and image adaptive error prediction model is specifically as follows: collecting a plurality of DCT/IDCT transformation data from historical video and image processing instances, the video and image data comprising intermediate results of the transformation, final output, and error measurements;
extracting features of video and image data, wherein the features of the video and image data comprise pixel differences before and after transformation;
normalizing the video and image data characteristics to eliminate dimension influence;
selecting a polynomial regression model to construct the video and image adaptive error prediction model, and estimating its parameters by the least squares method so that the model optimally fits the error distribution in the training data;
cross-validating the video and image adaptive error prediction model: dividing the video and image processing data into a training set and a verification set, adjusting the model according to the cross-validation results, and tuning the parameters of the optimization algorithm until satisfactory prediction performance is obtained; the optimized mathematical model is converted into algorithm code, so that it is convenient to load and call in the actual video and image processing flow.
Further, the video and image adaptive error prediction model is:

$$e = \beta_0 + \beta_1 x_{1,\mathrm{norm}} + \beta_2 x_{2,\mathrm{norm}} + \beta_3 x_{1,\mathrm{norm}}^{2} + \beta_4 x_{2,\mathrm{norm}}^{2}$$

the corresponding least-squares parameter estimate is:

$$\hat{\beta} = (P^{\top}P)^{-1}P^{\top}E$$

wherein P is the design matrix whose i-th row is $(1,\, x_{i1,\mathrm{norm}},\, x_{i2,\mathrm{norm}},\, x_{i1,\mathrm{norm}}^{2},\, x_{i2,\mathrm{norm}}^{2})$ and E is the error vector.
Further, the preprocessing flow is as follows:
adjusting the video frames and images to the target processing size; for the video in the video and image data, ensuring the frame rate of all frames is consistent and performing frame interpolation or frame decimation as necessary; applying a denoising algorithm to reduce image noise in the video and image data; adjusting the brightness and contrast of the image and performing gamma correction, while optimizing the data distribution so that the image is better suited to the DCT; dividing the video frames and images into a plurality of blocks of fixed-format pixels;
for a video sequence, performing inter-frame alignment to ensure motion consistency between consecutive frames and reduce motion-induced misprediction; converting the image data to the required precision format, such as integer or floating point, to match the requirements of the DCT/IDCT algorithm.
Further, predicting the error range of each step according to the current transform characteristics and the historical data is specifically: extracting transform-related features from the video frame or image block currently to be processed and normalizing the data; combining the historical transform data, selecting the historical cases closest to the current data based on feature similarity as the reference basis for error prediction, the similarity being determined by a distance metric; training a polynomial regression model with the integrated historical data and features, the model aiming to learn the error patterns in the historical data and to predict the error distribution likely to occur in the current transform step;
taking the normalized features of the current data as input and feeding them into the trained error prediction model;
error range prediction: according to the input features, the model outputs a possible error range or specific error value for each step of the current transform; the predicted error range is analyzed to determine the transform steps or intermediate variables with the most significant influence on the final output, error compensation is performed for these first, and a compensation strategy is dynamically formulated based on the predicted error, including but not limited to adjusting coefficients, directly adding compensation values at specific positions, or adjusting the calculation order.
Further, the basic DCT/IDCT transform steps comprise: performing a one-dimensional inverse transform on the video and image, selecting integer arithmetic, and applying multiplication coefficients to the video and image data to obtain one-dimensional inverse-transformed video and image output data;
performing post-processing of the inverse transform, applying a right-shift operation to the one-dimensional inverse transform result;
preprocessing the forward transform of the video and image, applying a left-shift operation to the input video and image data with fixed shift parameters;
and performing a one-dimensional forward transform on the video and image, selecting integer arithmetic and applying multiplication coefficients, to obtain one-dimensional forward-transformed output data, the forward transform being processed in the direction opposite to the one-dimensional inverse transform.
The beneficial effects are that: the transformation method applied to video and image processing provided by the application achieves effective control of accumulated errors in the DCT/IDCT transform process by introducing the adaptive error prediction model and the dynamic compensation mechanism, and markedly enhances the quality and efficiency of video and image processing; the specific technical effects are as follows:
Remarkably improves visual quality and compression efficiency: the error range of each transformation step is accurately predicted through the self-adaptive error prediction model, and error compensation is pertinently implemented, so that accumulated errors common in multi-stage transformation are effectively restrained, the visual quality of a reconstructed video and an image is remarkably improved, meanwhile, higher compression efficiency is maintained, and information loss is reduced.
Enhancing the adaptability and robustness of the algorithm: the method can dynamically adjust the compensation strategy according to the characteristics of the input video and the image, can flexibly adapt to the high dynamic range video, the low illumination environment or the fast motion scene, ensures that the optimal processing effect can be achieved under various conditions, and improves the generality and the robustness of the algorithm.
Optimizing computing resources and processing speed: although the error prediction and compensation steps are added, due to the adoption of efficient algorithms such as a polynomial regression model, unnecessary calculation is reduced through preprocessing and intelligent adjustment of intermediate variables, the overall calculation complexity is controlled, the high efficiency of the processing flow is maintained, and the method is beneficial to real-time video processing and application of a large-scale data set.
Intelligent feedback and continuous optimization: through an intelligent feedback mechanism, model parameters and a compensation strategy are continuously adjusted according to the deviation between actual output and an expected result, continuous self-optimization of algorithm performance is realized, stability and accuracy of long-term operation are ensured, and the requirement of manual intervention is reduced.
Promoting standardization and compatibility: the technical scheme optimizes on the basis of not changing the existing DCT/IDCT basic architecture, ensures compatibility with existing video coding and decoding standards (such as JPEG, MPEG and the like), facilitates seamless integration and application within existing technology frameworks, and accelerates the promotion and popularization of the technology.
Drawings
Fig. 1 is a flow chart of a transformation method applied to video and image processing.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below in conjunction with the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The application discloses a transformation method applied to video and image processing, referring to fig. 1, comprising the steps of: S1, constructing and loading a video and image adaptive error prediction model, wherein the model is trained based on historical video and image transformation data, including, but not limited to, linear test and randomly generated test data sets; the construction of the video and image adaptive error prediction model is specifically as follows: collecting a plurality of DCT/IDCT transformation data from historical video and image processing instances, the video and image data comprising intermediate results of the transformation, final outputs, and error measurements;
extracting features of video and image data, wherein the features of the video and image data comprise pixel differences before and after transformation;
normalizing the video and image data characteristics to eliminate dimension influence;
selecting a polynomial regression model to construct the video and image adaptive error prediction model, and estimating its parameters by the least squares method so that the model optimally fits the error distribution in the training data;
cross-validating the video and image adaptive error prediction model: dividing the video and image processing data into a training set and a verification set, adjusting the model according to the cross-validation results, and tuning the parameters of the optimization algorithm until satisfactory prediction performance is obtained; the optimized mathematical model is converted into algorithm code, so that it is convenient to load and call in the actual video and image processing flow.
The general error e can be predicted from a polynomial combination of m features in the form:

$$e = \sum_{j=0}^{K-1} \beta_j\, \phi_j\!\left(x_{1,\mathrm{norm}}, \ldots, x_{m,\mathrm{norm}}\right), \qquad \phi_0 \equiv 1$$

where d is the highest order of the polynomial, the $\phi_j$ range over all polynomial terms of the m normalized features up to order d (individual powers and interaction terms), $K = \binom{m+d}{d}$ is the total number of model parameters (including the constant term), and $\beta_0, \ldots, \beta_{K-1}$ are the model parameters to be solved.

Model parameters are estimated using least squares, i.e. by minimizing the residual sum of squares:

$$\hat{\beta} = \arg\min_{\beta}\ \sum_{i=1}^{N}\Big(e_i - \sum_{j=0}^{K-1}\beta_j P_{ij}\Big)^{2}$$

where N is the number of samples and $P_{ij}$ is the value of the i-th sample at the j-th feature polynomial term (including the individual powers and interaction terms of $X_{i,\mathrm{norm}}$).
The data set is divided into a training set and a validation set; after the parameters are estimated on the training set, model performance is evaluated on the validation set, for example by the mean squared error (MSE) or the coefficient of determination. According to the validation results, the polynomial order d and the model parameters β are adjusted through grid search, random search or gradient descent until the model performance meets the preset standard.
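To make the fitting and tuning procedure concrete, here is a minimal numpy sketch that fits the polynomial model by least squares and grid-searches the order d by validation MSE. The function names, the synthetic data, and the candidate orders are illustrative assumptions, not details from the patent.

```python
import numpy as np

def design_matrix(X, d):
    # Columns: constant term, then each normalized feature raised to powers 1..d
    # (interaction terms omitted, matching the two-feature example below).
    cols = [np.ones(len(X))]
    for p in range(1, d + 1):
        for j in range(X.shape[1]):
            cols.append(X[:, j] ** p)
    return np.column_stack(cols)

def fit_least_squares(P, e):
    # Solves min_beta ||P beta - e||^2; lstsq is used for numerical stability.
    beta, *_ = np.linalg.lstsq(P, e, rcond=None)
    return beta

def select_order(X_tr, e_tr, X_val, e_val, orders=(1, 2, 3)):
    # Grid search over the polynomial order d, scored by validation MSE.
    best = None
    for d in orders:
        beta = fit_least_squares(design_matrix(X_tr, d), e_tr)
        mse = float(np.mean((e_val - design_matrix(X_val, d) @ beta) ** 2))
        if best is None or mse < best[0]:
            best = (mse, d, beta)
    return best  # (validation MSE, chosen order d, fitted parameters)

rng = np.random.default_rng(0)
X = rng.random((300, 2))                        # two normalized features
e = 0.2 * X[:, 0] + 0.05 * X[:, 1] ** 2 + 0.01 * rng.standard_normal(300)
mse, d, beta = select_order(X[:200], e[:200], X[200:], e[200:])
```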
Specific mathematical formula example
Two features $x_{1,\mathrm{norm}}$ and $x_{2,\mathrm{norm}}$ are considered in the application. The second-order polynomial model (d = 2), without interaction terms, is:

$$e = \beta_0 + \beta_1 x_{1,\mathrm{norm}} + \beta_2 x_{2,\mathrm{norm}} + \beta_3 x_{1,\mathrm{norm}}^{2} + \beta_4 x_{2,\mathrm{norm}}^{2}$$

The corresponding least-squares parameter estimate is:

$$\hat{\beta} = (P^{\top}P)^{-1}P^{\top}E$$

wherein P is the design matrix whose i-th row is $(1,\, x_{i1,\mathrm{norm}},\, x_{i2,\mathrm{norm}},\, x_{i1,\mathrm{norm}}^{2},\, x_{i2,\mathrm{norm}}^{2})$ and E is the error vector.
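A worked example of this two-feature, second-order case, again as an illustrative sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((200, 2))                         # normalized features (x1, x2)
e = 0.1 * X[:, 0] + 0.05 * X[:, 1] ** 2 + 0.01 * rng.standard_normal(200)

# Design matrix P: i-th row is (1, x_i1, x_i2, x_i1^2, x_i2^2)
P = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                     X[:, 0] ** 2, X[:, 1] ** 2])

# Closed-form least squares: beta_hat = (P^T P)^(-1) P^T E
beta_hat = np.linalg.solve(P.T @ P, P.T @ e)
print(beta_hat)  # [beta0, beta1, beta2, beta3, beta4]
```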
S2, preprocessing the input video and image data and preparing it for the basic DCT/IDCT transform steps; the basic DCT/IDCT transform steps include: performing a one-dimensional inverse transform on the video and image, selecting integer arithmetic, and applying multiplication coefficients to the video and image data to obtain one-dimensional inverse-transformed video and image output data;
performing post-processing of the inverse transform, applying a right-shift operation to the one-dimensional inverse transform result;
preprocessing the forward transform of the video and image, applying a left-shift operation to the input video and image data with fixed shift parameters;
and performing a one-dimensional forward transform on the video and image, selecting integer arithmetic and applying multiplication coefficients, to obtain one-dimensional forward-transformed output data, the forward transform being processed in the direction opposite to the one-dimensional inverse transform.
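A minimal sketch of these integer transform steps: left-shift pre-scaling, integer multiplication by fixed coefficients, and rounded right-shift post-scaling with fixed shift parameters. The 4-point coefficient matrix is the well-known H.264/HEVC-style core transform, used here only as a stand-in, and the shift values are illustrative; the patent does not disclose concrete coefficients or shift parameters.

```python
import numpy as np

C = np.array([[64,  64,  64,  64],
              [83,  36, -36, -83],
              [64, -64, -64,  64],
              [36, -83,  83, -36]], dtype=np.int64)

PRE_SHIFT, FWD_SHIFT, INV_SHIFT = 4, 11, 7  # fixed shift parameters (assumed)

def rshift_round(v, s):
    # Right shift with a rounding offset added first
    return (v + (1 << (s - 1))) >> s

def forward_1d(x):
    x = np.asarray(x, dtype=np.int64) << PRE_SHIFT  # forward pre-processing: left shift
    return rshift_round(C @ x, FWD_SHIFT)           # integer multiply, then right shift

def inverse_1d(y):
    # The transpose runs the butterfly in the direction opposite to the forward pass
    return rshift_round(C.T @ np.asarray(y, dtype=np.int64), INV_SHIFT)

x = np.array([10, 20, 30, 40])
print(inverse_1d(forward_1d(x)))  # recovers x up to rounding
```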
The detailed preprocessing flow in S2:
adjusting the video frames and images to the target processing size, typically a multiple of 8, to ensure compatibility with the DCT/IDCT block size;
for the video in the video and image data, ensuring the frame rate of all frames is consistent and performing frame interpolation or frame decimation as necessary; applying a denoising algorithm (such as median filtering, bilateral filtering or non-local means denoising) to reduce image noise in the video and image data and improve the signal-to-noise ratio;
The brightness and the contrast of the image are adjusted, so that the overall visual effect of the image is more uniform, and the details of dark parts and bright parts are enhanced; gamma correction is carried out on the image, so that display consistency on different devices is ensured, and meanwhile, data distribution is optimized to be more suitable for DCT transformation; dividing the video frames and images into blocks of pixels of a fixed format (e.g., 8 x 8);
For a video sequence, inter-frame alignment is carried out, so that the motion consistency between continuous frames is ensured, and the misprediction caused by motion is reduced; the image data is converted to the required precision format, such as integer or floating point, to match the requirements of the DCT/IDCT algorithm.
Through the preprocessing step, the input video and image data are optimized to the state most suitable for DCT/IDCT transformation, and meanwhile, high-quality input is provided for the self-adaptive error prediction model, so that the high efficiency and accuracy of the whole transformation process are ensured.
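A minimal sketch of this preprocessing flow, assuming an 8x8 block size; the helper name, the 3x3 median filter, and the gamma value are illustrative choices, not specified by the patent.

```python
import numpy as np
from scipy.ndimage import median_filter

BLOCK = 8

def preprocess(frame, gamma=2.2):
    # Pad height/width up to a multiple of 8 so every block is complete
    h, w = frame.shape
    ph, pw = (-h) % BLOCK, (-w) % BLOCK
    frame = np.pad(frame, ((0, ph), (0, pw)), mode="edge")
    # Denoise with a 3x3 median filter
    frame = median_filter(frame, size=3)
    # Gamma correction on normalized intensities, then back to 8-bit range
    frame = (255.0 * (frame / 255.0) ** (1.0 / gamma)).astype(np.int16)
    # Split into 8x8 blocks ready for the DCT
    H, W = frame.shape
    return (frame.reshape(H // BLOCK, BLOCK, W // BLOCK, BLOCK)
                 .swapaxes(1, 2)
                 .reshape(-1, BLOCK, BLOCK))

blocks = preprocess(np.random.randint(0, 256, (37, 50)))
print(blocks.shape)  # (N, 8, 8)
```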
S3, performing DCT/IDCT transformation basic steps, and simultaneously recording key intermediate variables in each level of transformation;
In the present solution, the "key intermediate variables" refer to variables that are generated during the execution of the DCT/IDCT transform and have a direct influence on the final transform result. These variables are typically temporary stored data during execution of the transformation algorithm that reflect the critical state of the data change before and after the transformation. For 8x8 Discrete Cosine Transform (DCT) and Inverse Discrete Cosine Transform (IDCT), key intermediate variables include, but are not limited to, the following:
Key intermediate variables in the DCT transform
Coefficient matrix: when performing the DCT, a set of fixed orthogonal transform coefficient matrices is first used, which defines how the spatial domain is mapped to the frequency domain. While these are not intermediate variables in the traditional sense, they are the core of the transform computation, and the subsequent adaptive error compensation involves fine-tuning these coefficients.
Frequency-domain coefficients: the coefficient matrix generated after each 8x8 pixel block undergoes the DCT, in particular the direct-current component (DC coefficient) and the alternating-current components (AC coefficients); these are the key variables that directly reflect the transform result.
Quantization results: if a quantization operation is performed, the quantized coefficients are also important intermediate variables, since the quantization process introduces nonlinear errors that the subsequent error compensation mechanism must take into account.
Key intermediate variables in the IDCT transform
Inverse-quantization coefficients: if quantization followed the DCT, the coefficients or quantization table used in the inverse quantization process are key variables that determine how well the dequantized data is recovered.
Inverse-transform coefficients: the IDCT likewise uses a set of fixed inverse transform coefficient matrices, which are the basis for recovering the spatial-domain signal.
Reconstructed pixel matrix: the pixels produced by the IDCT but not yet post-processed (e.g., deblocked); these pixel values relate directly to image quality.
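As an illustration, a sketch of an 8x8 DCT that records the key intermediate variables listed above (basis matrix, DC/AC coefficients, quantization result); the flat quantization step of 16 is an assumption.

```python
import numpy as np

N = 8
k, n = np.mgrid[0:N, 0:N]
A = np.sqrt(2.0 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
A[0, :] = np.sqrt(1.0 / N)          # orthogonal DCT-II basis (coefficient) matrix

def dct2_with_log(block, qstep=16):
    log = {"basis": A}
    coeffs = A @ block @ A.T        # frequency-domain coefficients
    log["dc"] = coeffs[0, 0]        # DC coefficient
    log["ac"] = coeffs.copy()       # AC coefficients (DC zeroed out)
    log["ac"][0, 0] = 0.0
    log["quantized"] = np.round(coeffs / qstep)   # quantization result
    return coeffs, log

block = np.random.randint(0, 256, (8, 8)).astype(float) - 128.0
coeffs, log = dct2_with_log(block)
# Inverse path: dequantize with the same step, then reconstruct the pixels
recon = A.T @ (log["quantized"] * 16) @ A
```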
Intermediate variable adjustment involved in adaptive error compensation
Prediction error: the potential error distribution of the current transform step, as predicted by the adaptive error prediction model. These predicted error values are the basis for adjusting the intermediate variables.
Compensation value: the specific compensation value or coefficient-adjustment scheme computed for the predicted error. For example, for some key coefficients (e.g., the DC coefficient), a compensation value is added directly to correct the predicted error; for other coefficients, the multiplication coefficients used in the transform or inverse transform are adjusted.
Intermediate calculations: temporary variables in a particular transform stage or calculation step, adjusted according to the error compensation strategy to reduce accumulated errors.
The method comprises the following specific steps
Error prediction: for the current transform block, predict the error produced at each step using the error prediction model trained on historical data.
Intermediate variable identification: determine which intermediate variables (e.g., specific coefficients, quantization results) have the greatest impact on the final output and are most susceptible to error accumulation.
Compensation calculation: compute the compensation required for each key intermediate variable from the predicted error. This involves nontrivial calculation logic, such as adjusting specific elements of the DCT/IDCT coefficient matrix according to the predicted error distribution.
Compensation execution: after each stage of the transform, apply the computed compensation values by modifying the values of these intermediate variables directly or by adjusting the parameters of subsequent calculation steps.
Iterative optimization: through an intelligent feedback mechanism, continuously fine-tune the error prediction model and the compensation strategy according to the deviation between the actual output and the expected result, achieving continuous optimization.
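A minimal sketch of this predict-rank-compensate loop; `predict_errors` stands in for the trained polynomial regression model, and all names and the top-k heuristic are illustrative assumptions, not the patent's prescribed logic.

```python
import numpy as np

def compensate_block(coeffs, predict_errors, top_k=4):
    err = predict_errors(coeffs)                 # 1) predicted error per coefficient
    idx = np.argsort(np.abs(err).ravel())[::-1]  # 2) rank variables by impact
    comp = np.zeros_like(coeffs)
    for flat in idx[:top_k]:                     # 3) compensation for the top-k variables
        comp.ravel()[flat] = -err.ravel()[flat]
    return coeffs + comp, comp                   # 4) apply the compensation values

def feedback_update(model_params, actual, expected, lr=0.01):
    # 5) iterative optimization: nudge model parameters toward the observed deviation
    return model_params - lr * (actual - expected).mean()

coeffs = np.random.randn(8, 8)
corrected, comp = compensate_block(coeffs, lambda c: 0.01 * c)  # dummy predictor
```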
S4, running an adaptive error prediction model, and predicting an error range of each step according to the current transformation characteristics and historical data;
Features related to the transform are extracted from the current video frame or image block to be processed and the data is normalized. Combining the historical transform data, the historical cases closest to the current data are selected based on feature similarity and used as the reference basis for error prediction; similarity is determined with a distance metric (e.g., Euclidean distance or cosine similarity). A polynomial regression model is trained with the integrated historical data and features; the model aims to learn the error patterns in the historical data and to predict the error distribution likely to occur in the current transform step. Model parameters are tuned through methods such as cross-validation, grid search or gradient descent to optimize prediction performance.
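A small sketch of the similarity-based retrieval step, ranking historical cases by Euclidean distance or cosine similarity; array names are illustrative.

```python
import numpy as np

def nearest_cases(current, history, k=5, metric="euclidean"):
    # current: (m,) normalized feature vector; history: (N, m) matrix of cases
    if metric == "euclidean":
        d = np.linalg.norm(history - current, axis=1)
        order = np.argsort(d)                    # smallest distance = most similar
    else:                                        # cosine similarity
        num = history @ current
        den = np.linalg.norm(history, axis=1) * np.linalg.norm(current) + 1e-12
        order = np.argsort(num / den)[::-1]      # largest similarity first
    return order[:k]

hist = np.random.rand(100, 6)                    # 100 historical feature vectors
cur = np.random.rand(6)
print(nearest_cases(cur, hist, k=3))             # indices of the 3 closest cases
```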
Taking the normalized characteristics of the current data as input, and sending the normalized characteristics into a trained error prediction model;
Error range prediction: according to the input features, the model outputs a possible error range or specific error value for each step of the current transform; these predictions reflect the expected error distribution given the historical data and the current data characteristics.
The predicted error range is analyzed to determine which transform steps or intermediate variables affect the final output most significantly and therefore need error compensation first; a compensation strategy is dynamically formulated based on the predicted error, including but not limited to adjusting coefficients, directly adding compensation values at specific positions, or adjusting the calculation order.
That is, the intermediate variables are fine-tuned according to the prediction error, including adjusting coefficients or directly adding compensation values at specific positions.
Through this series of steps, the technical scheme can accurately and dynamically predict the error range from the specific conditions of the current transform and past experience, and take compensating measures, so that accumulated errors are effectively suppressed and the quality and efficiency of video and image processing are improved.
After all transform steps are completed, post-processing is performed: after the forward transform, an integer matrix and right-shift operations with fixed shift parameters are applied, and the output is adjusted through addition, multiplication and function operations.
Through the steps, the scheme not only solves the problem of precision and calculation error accumulation in the prior art, but also improves the adaptability and robustness of the algorithm through an intelligent self-adaptive mechanism, so that the method becomes a more reliable and efficient transformation method in the field of video and image processing.
In summary, the technical scheme of the application solves the core problems of the traditional DCT/IDCT method through a series of innovative improvements, improves the comprehensive performance of video and image processing through intelligent means, and provides powerful technical support for the development of the related field.
It should be noted that, in the present specification, the embodiments are described progressively; identical and similar parts of the embodiments refer to one another, and each embodiment mainly describes what differs from the others. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, the description is relatively brief; refer to the description of the method embodiments for the relevant parts. The apparatus and system embodiments described above are merely illustrative: units described as separate may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the invention without undue burden.
The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the present application.
Claims (4)
1. A transformation method applied to video and image processing, characterized by comprising: constructing and loading a video and image adaptive error prediction model, wherein the model is trained based on historical video and image transformation data; preprocessing the input video and image data to prepare it for the basic DCT/IDCT transform steps; performing the basic DCT/IDCT transform steps while recording key intermediate variables at each stage of the transform;
running the adaptive error prediction model, and predicting the error range of each step according to the current transform characteristics and the historical data;
after all transform steps are completed, performing post-processing: after the forward transform, an integer matrix and right-shift operations with fixed shift parameters are applied, and the output is adjusted through addition, multiplication and function operations; the construction of the video and image adaptive error prediction model is specifically as follows: collecting a plurality of DCT/IDCT transformation data from historical video and image processing instances, the video and image data comprising intermediate results of the transformation, final outputs, and error measurements;
extracting features of video and image data, wherein the features of the video and image data comprise pixel differences before and after transformation;
normalizing the video and image data characteristics to eliminate dimension influence;
selecting a polynomial regression model to construct the video and image adaptive error prediction model, and estimating its parameters by the least squares method so that the model optimally fits the error distribution in the training data;
cross-validating the video and image adaptive error prediction model: dividing the video and image processing data into a training set and a verification set, adjusting the model according to the cross-validation results, and tuning the parameters of the optimization algorithm until satisfactory prediction performance is obtained; the optimized mathematical model is converted into algorithm code, so that it is convenient to load and call in the actual video and image processing flow.
2. The transformation method applied to video and image processing according to claim 1, wherein the preprocessing flow:
adjusting the video frames and images to the target processing size; for the video in the video and image data, ensuring the frame rate of all frames is consistent and performing frame interpolation or frame decimation processing; applying a denoising algorithm to reduce image noise in the video and image data; adjusting the brightness and contrast of the image and performing gamma correction, while optimizing the data distribution so that the image is better suited to the DCT; dividing the video frames and images into a plurality of blocks of fixed-format pixels;
For a video sequence, inter-frame alignment is carried out, so that the motion consistency between continuous frames is ensured, and the misprediction caused by motion is reduced; the image data is converted to the required precision format, such as integer or floating point, to match the requirements of the DCT/IDCT algorithm.
3. The transformation method applied to video and image processing according to claim 1, wherein predicting the error range of each step according to the current transform characteristics and the historical data is specifically: extracting transform-related features from the video frame or image block currently to be processed and normalizing the data; combining the historical transform data, selecting the historical cases closest to the current data based on feature similarity as the reference basis for error prediction, the similarity being determined by a distance metric; training a polynomial regression model with the integrated historical data and features, the model aiming to learn the error patterns in the historical data and to predict the error distribution likely to occur in the current transform step;
taking the normalized features of the current data as input and feeding them into the trained error prediction model;
error range prediction: according to the input features, the model outputs a possible error range or specific error value for each step of the current transform; the predicted error range is analyzed to determine the transform steps or intermediate variables with the most significant influence on the final output, error compensation is performed for these first, and a compensation strategy is dynamically formulated based on the predicted error, comprising adjusting coefficients, directly adding compensation values at specific positions, or adjusting the calculation order.
4. The transformation method applied to video and image processing according to claim 1, wherein the basic DCT/IDCT transform steps comprise: performing a one-dimensional inverse transform on the video and image, selecting integer arithmetic, and applying multiplication coefficients to the video and image data to obtain one-dimensional inverse-transformed video and image output data;
performing post-processing of the inverse transform, applying a right-shift operation to the one-dimensional inverse transform result;
preprocessing the forward transform of the video and image, applying a left-shift operation to the input video and image data with fixed shift parameters;
and performing a one-dimensional forward transform on the video and image, selecting integer arithmetic and applying multiplication coefficients, to obtain one-dimensional forward-transformed output data, the forward transform being processed in the direction opposite to the one-dimensional inverse transform.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410776166.6A CN118368443B (en) | 2024-06-17 | 2024-06-17 | Transformation method applied to video and image processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410776166.6A CN118368443B (en) | 2024-06-17 | 2024-06-17 | Transformation method applied to video and image processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118368443A CN118368443A (en) | 2024-07-19 |
CN118368443B true CN118368443B (en) | 2024-08-30 |
Family
ID=91884839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410776166.6A Active CN118368443B (en) | 2024-06-17 | 2024-06-17 | Transformation method applied to video and image processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118368443B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111819852A (en) * | 2018-03-07 | 2020-10-23 | 华为技术有限公司 | Method and apparatus for residual symbol prediction in transform domain |
CN112911289A (en) * | 2021-05-10 | 2021-06-04 | 杭州雄迈集成电路技术股份有限公司 | DCT/IDCT transformation optimization method and system |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101106714A (en) * | 2007-07-29 | 2008-01-16 | 浙江大学 | Conversion method for video and image processing |
US8885701B2 (en) * | 2010-09-08 | 2014-11-11 | Samsung Electronics Co., Ltd. | Low complexity transform coding using adaptive DCT/DST for intra-prediction |
CN104320668B (en) * | 2014-10-31 | 2017-08-01 | 上海交通大学 | HEVC/H.265 dct transform and the SIMD optimization methods of inverse transformation |
WO2017023829A1 (en) * | 2015-07-31 | 2017-02-09 | Stc.Unm | System and methods for joint and adaptive control of rate, quality, and computational complexity for video coding and video delivery |
JP6470474B2 (en) * | 2015-12-09 | 2019-02-13 | ドルビー ラボラトリーズ ライセンシング コーポレイション | Low-computation lookup table construction with reduced interpolation error |
KR102257016B1 (en) * | 2018-11-27 | 2021-05-26 | 에스케이텔레콤 주식회사 | No-reference quality assessment method and apparatus |
- 2024-06-17: CN application CN202410776166.6A, patent CN118368443B (en), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN118368443A (en) | 2024-07-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |