CN112001854A - Method for repairing coded image and related system and device - Google Patents


Info

Publication number
CN112001854A
CN112001854A
Authority
CN
China
Prior art keywords
image
frame
frames
model
image frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010673891.2A
Other languages
Chinese (zh)
Inventor
Chen Yao
Fang Ruidong
Lin Jucai
Yin Jun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202010673891.2A priority Critical patent/CN112001854A/en
Publication of CN112001854A publication Critical patent/CN112001854A/en
Pending legal-status Critical Current

Classifications

    • G06T5/77
    • G06F18/24 Classification techniques
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/207 Analysis of motion for motion estimation over a hierarchy of resolutions
    • G06T7/223 Analysis of motion using block-matching
    • G06T2207/10016 Video; Image sequence
    • G06T2207/30168 Image quality inspection

Abstract

The application discloses a method for restoring a coded image, together with a related system and device. The method comprises the following steps: acquiring an image frame to be processed, and predicting the image quality of the image frame to obtain a prediction result; determining the type of image restoration model for the image frame based on the prediction result, so as to obtain an image restoration model matched with the prediction result; and restoring the image frame through the image restoration model matched with the prediction result. In this manner, image quality prediction is first performed on the compression-damaged image frame, the image restoration model best suited to the frame is determined from the prediction result, and the frame is then restored by that model. This effectively preserves the real-time performance of image-frame encoding and restoration, improves processing speed, and provides strong adaptive capability and high applicability.

Description

Method for repairing coded image and related system and device
Technical Field
The present application relates to the field of image restoration technologies, and in particular, to a method for restoring a coded image and a related system and device.
Background
With the rapid growth of internet video data, video streams transmitted over a wireless channel or network are generally encoded at a high compression rate in order to control storage and transmission costs. Common video coding standards, such as H.264/AVC (a highly compressed digital video codec standard), typically employ lossy compression algorithms. While these algorithms reduce video volume, they introduce compression artifacts such as blocking, ringing, and breathing effects, which ultimately degrade the quality of the image received at the terminal. Since people are the ultimate recipients of video data, poor video quality degrades the viewing experience, and some form of coded-image restoration technique is therefore needed to improve the quality of the received video signal.
Image restoration in the traditional sense fills visually plausible content into defective or occluded regions of an image. It is widely used in application scenarios such as cultural-relic protection, movie special effects, occluded-image completion, and error concealment in image/video transmission; it has important application value and is a research hotspot in computer graphics and computer vision. In the prior art, image restoration is usually realized by deep learning, for example by training a self-encoding fully convolutional neural network or a generative adversarial network to obtain an image restoration model.
Coded-image restoration, by contrast, is a technique for improving the quality of video images rather than conventional content inpainting: such images have no content loss, only degradation from the compression artifacts introduced by block-based hybrid coding. Removing compression artifacts from images whose quality has deteriorated under compression encoding improves the video's image quality and makes it more acceptable to users. Conventional video decompression-artifact algorithms achieve this by designing a filter. For example, linear low-pass filters, linear or nonlinear isotropic filters, and the Deblocking Filter (DF) and Sample Adaptive Offset (SAO) in H.265/HEVC (the high-efficiency video coding standard) are all effective to some extent in reducing blocking or ringing, but they also bring problems: the linear low-pass filter blurs the recovered video; the linear or nonlinear isotropic filter requires its threshold to be set empirically, and the threshold setting usually has a large influence on the filtering result, so robustness is weak; and the in-loop techniques in H.265/HEVC greatly increase coding complexity and affect the real-time performance of the encoding algorithm.
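To illustrate the blurring drawback of the simplest filter mentioned above, the following is a minimal NumPy sketch of a linear low-pass (mean) filter applied to one luma channel. It is a generic textbook filter, not any filter specified by this application or by H.265/HEVC:

```python
import numpy as np

def box_filter(frame: np.ndarray, k: int = 3) -> np.ndarray:
    """Smooth a single-channel frame with a k x k mean filter.

    Averaging suppresses blocking discontinuities, but it also
    attenuates true edges, which is why low-pass deblocking
    blurs the recovered video.
    """
    pad = k // 2
    padded = np.pad(frame.astype(np.float64), pad, mode="edge")
    out = np.zeros(frame.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return out / (k * k)
```

On a frame containing a sharp step edge, the flat regions are preserved but the edge column is averaged toward its neighbours, i.e. blurred.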
Deep learning has achieved excellent results in many fields thanks to its versatility and adaptability, and video decompression-artifact removal is no exception. Research in this direction falls mainly into in-loop filtering and out-of-loop filtering. In-loop filtering usually replaces an original coding module with a deep neural network to improve overall coding quality, but this seriously slows encoding and has poor real-time performance. Out-of-loop filtering post-processes the decoded video frames with a deep neural network; it does not affect the original video coding framework and can be applied to different compression algorithms, but such algorithms currently still lack adaptive capability.
Disclosure of Invention
The method for restoring a coded image provided by the present application can solve the prior-art problems of slow coding, poor real-time performance, and lack of adaptive capability in the process of restoring coded images.
In order to solve the above technical problem, the first technical solution adopted by the present application is to provide a method for restoring a coded image, comprising the following steps: acquiring an image frame to be processed, and predicting the image quality of the image frame to obtain a prediction result; determining the type of image restoration model for the image frame based on the prediction result, to obtain an image restoration model matched with the prediction result; and restoring the image frame through the image restoration model matched with the prediction result.
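The three claimed steps can be sketched as a small dispatch routine. This is an illustrative skeleton only; `predict_quality` and the `models` table are hypothetical stand-ins for the prediction network and restoration models described later in the application:

```python
def repair_frame(frame, predict_quality, models: dict):
    """Predict quality, select the matched restoration model, repair.

    `predict_quality` maps a frame to a prediction result (a key),
    and `models` maps each prediction result to a repair callable.
    """
    prediction = predict_quality(frame)   # step 1: image quality prediction
    model = models[prediction]            # step 2: select the matched model
    return model(frame)                   # step 3: restore the frame
```

The point of the structure is that the per-frame cost is one prediction plus one model application, which is what keeps the pipeline real-time.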
The method comprises the following steps of obtaining an image frame to be processed, predicting the image quality of the image frame and obtaining a prediction result, wherein the steps of obtaining the image frame to be processed and predicting the image quality of the image frame comprise: acquiring an image frame to be processed and characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame to acquire characteristic parameters corresponding to corresponding image restoration models; the image frame is obtained by carrying out compression coding and decoding on a corresponding lossless source image through setting quantization parameters, and the characteristic parameters comprise the setting quantization parameters or marks corresponding to the setting quantization parameters; determining the type of the image restoration model of the image frame based on the prediction result, and obtaining the image restoration model matched with the prediction result comprises the following steps: determining the type of an image restoration model of the image frame through the characteristic parameters to obtain an image restoration model corresponding to the characteristic parameters; the step of repairing the image frame through the image repairing model matched with the prediction result comprises the following steps: and repairing the image frame through an image repairing model corresponding to the characteristic parameters.
The method comprises the following steps of obtaining an image frame to be processed, predicting the image quality of the image frame and obtaining a prediction result, wherein the steps of obtaining the image frame to be processed and predicting the image quality of the image frame comprise: and acquiring the image frame to be processed and the characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame and a preset corresponding relation to obtain a prediction result.
Before the step of acquiring the image frame to be processed and predicting its image quality from the feature information and the preset correspondence, the method further comprises: inputting image frames into a plurality of image restoration models to restore them respectively, and establishing a correspondence between the feature information of each image frame and the image restoration model with the best restoration effect, thereby obtaining the preset correspondence, wherein there are a plurality of such image frames.
The feature information of an image frame comprises its code rate, and the step of inputting image frames into the plurality of image restoration models and establishing the preset correspondence comprises: inputting the image frames at each set target code rate into the plurality of image restoration models to restore them respectively, and establishing a correspondence between the target code rate and the image restoration model with the best restoration effect, thereby obtaining the preset correspondence, wherein there are a plurality of target code rates. The step of acquiring the image frame to be processed and predicting its image quality then comprises: acquiring the image frame and its code rate, and predicting the image quality of the image frame according to the code rate and the preset correspondence to obtain the prediction result.
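Building such a code-rate-to-model correspondence amounts to an offline benchmark. The sketch below is an illustrative approximation under assumed inputs: per-rate sample frames and lossless references are given as numbers, and "best restoration effect" is stood in for by lowest squared error, which is not necessarily the quality metric the application intends:

```python
def build_bitrate_table(samples, references, models, bitrates):
    """For each target bitrate, try every candidate restoration model
    on the sample frames and remember the model name with the lowest
    total squared error against the lossless references."""
    table = {}
    for rate in bitrates:
        def total_error(model):
            return sum((model(f) - ref) ** 2
                       for f, ref in zip(samples[rate], references[rate]))
        table[rate] = min(models, key=lambda name: total_error(models[name]))
    return table
```

At inference time the frame's measured code rate indexes this table, which is the "preset correspondence" used for prediction.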
The step of acquiring an image frame to be processed and predicting its image quality to obtain a prediction result comprises: acquiring the image frame to be processed, and inputting it into a trained image quality prediction network model to predict its image quality and obtain the prediction result.
The feature information comprises the frame type of the image frame, and the step of predicting the image quality of the image frame from its feature information to obtain feature parameters corresponding to the matching image restoration model comprises: acquiring the image frame to be processed and its feature information, inputting the image frame and its feature information into the trained image quality prediction network model to obtain the set quantization parameter corresponding to the matching image restoration model, and encoding the frame type together with the set quantization parameter to obtain the flag corresponding to the feature parameters.
The image restoration models comprise intra-frame image restoration models and inter-frame image restoration models, and the step of restoring the image frame through the image restoration model corresponding to the feature parameters comprises: restoring the image frame through the intra-frame or inter-frame image restoration model corresponding to the feature parameters.
The frame type comprises I frames, P frames, and B frames, and the step of restoring the image frame through the intra-frame or inter-frame image restoration model corresponding to the feature parameters comprises: if the frame type of the image frame is an I frame, restoring the image frame through the intra-frame image restoration model corresponding to the feature parameters.
The method for restoring the coded image further comprises: if the frame type of the image frame is a P frame and/or a B frame, restoring the image frame through the inter-frame image restoration model corresponding to the feature parameters.
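The frame-type routing above is a straightforward dispatch; a minimal sketch follows, where the two model dictionaries keyed by the predicted feature parameters are assumed placeholders:

```python
def select_repair_model(frame_type: str, feature_params,
                        intra_models: dict, inter_models: dict):
    """Route I frames to an intra-frame restoration model and
    P/B frames to an inter-frame model, keyed by the predicted
    feature parameters (e.g. the set quantization parameter or
    its flag)."""
    if frame_type == "I":
        return intra_models[feature_params]
    if frame_type in ("P", "B"):
        return inter_models[feature_params]
    raise ValueError(f"unknown frame type: {frame_type}")
```

Separating the two model families matches how the frames are coded: I frames carry only intra-prediction artifacts, while P/B frames can borrow clean pixels from neighbouring frames.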
The step of training the image quality prediction network model comprises: encoding a lossless source image with a plurality of set quantization parameters to obtain a plurality of compression-encoded images, and decoding them correspondingly to obtain a plurality of image frames; and training a first preset network model with the plurality of set quantization parameters and their corresponding image frames to establish the image quality prediction network model.
The step of training the first preset network model with the plurality of set quantization parameters and their corresponding image frames comprises: inputting the set quantization parameters and their corresponding image frames into the first preset network model, so that the model predicts the image quality of each frame and outputs a predicted quantization parameter corresponding to the matching image restoration model; and continuously optimizing, through an error back-propagation algorithm, the error between each set quantization parameter and its corresponding predicted quantization parameter, thereby establishing the image quality prediction network model.
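The optimize-the-QP-error idea can be shown with a deliberately tiny stand-in: a one-parameter linear model trained by gradient descent to regress the set quantization parameter from a single frame statistic. The feature here is synthetic and the model is nothing like the application's deep network; only the training objective (minimize the set-QP vs. predicted-QP error via backpropagated gradients) is the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Set QPs used during compression, and a fake degradation cue per frame
# (in reality this would be a statistic computed from the decoded frame).
qp = np.array([22.0, 27.0, 32.0, 37.0])
feature = qp / 51.0 + rng.normal(0, 0.01, size=qp.shape)

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    pred = w * feature + b               # predicted quantization parameter
    err = pred - qp                      # error against the set QP
    w -= lr * 2 * np.mean(err * feature) # gradient of the squared error
    b -= lr * 2 * np.mean(err)
```

After training, the model recovers the set QP from the degradation cue to within a small error, which is exactly what lets the prediction network pick the right restoration model at inference time.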
The intra-frame image restoration model is obtained through training, which comprises: encoding a lossless source image with a plurality of set quantization parameters to obtain a plurality of compression-encoded images, and decoding them correspondingly to obtain a plurality of image frames; and training a second preset network model with the set quantization parameters and the I-frame images among the corresponding image frames to establish the intra-frame image restoration model.
The second preset network model integrates a pyramid network model and a residual network model, and the step of training it with the set quantization parameters and the corresponding I-frame images comprises: inputting the set quantization parameters and the I-frame images among the corresponding image frames into the second preset network model so that it restores each I-frame image; and continuously adjusting the number of levels of the pyramid network model and the depth of the residual network model until the restoration effect of the second preset network model on the I-frame images exceeds a first set threshold, thereby establishing the intra-frame image restoration model.
The inter-frame image restoration model is obtained through training, which comprises: encoding a lossless source image with a plurality of set quantization parameters to obtain a plurality of compression-encoded images, and decoding them correspondingly to obtain a plurality of image frames; and training a third preset network model with the set quantization parameters and the images of every two or more adjacent frames whose frame types are P frames and/or B frames among the corresponding image frames, to establish the inter-frame image restoration model.
The third preset network model comprises an inter-frame matching-block model, and the step of training it with the set quantization parameters and the images of every two or more adjacent P and/or B frames to establish the inter-frame image restoration model comprises: inputting the set quantization parameters and the images of every two or more adjacent P and/or B frames among the corresponding image frames into the third preset network model so that it restores those images; and continuously adjusting the complexity of the third preset network model until its restoration effect on those images exceeds a second set threshold, thereby establishing the inter-frame image restoration model.
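The matching-block idea underlying the inter-frame model is classic block matching: for a block in the current frame, search a small window in a neighbouring frame for the best-matching block, which can then supply cleaner pixels. The sketch below uses a plain sum-of-absolute-differences (SAD) search; the application's learned matching-block model is presumably more sophisticated:

```python
import numpy as np

def best_match(block, ref, top, left, radius=2):
    """Find the position in `ref`, within +/-radius of (top, left),
    whose block has the smallest sum of absolute differences to
    `block` — the inter-frame matching step that lets an adjacent
    frame contribute pixels for restoration."""
    bh, bw = block.shape
    best, best_pos = None, (top, left)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue
            sad = np.abs(ref[y:y + bh, x:x + bw] - block).sum()
            if best is None or sad < best:
                best, best_pos = sad, (y, x)
    return best_pos
```

This is also the operation that classification G06T7/223 (analysis of motion using block-matching) on this patent refers to.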
In order to solve the above technical problem, the second technical solution adopted by the present application is to provide an intelligent terminal, comprising: an input module for acquiring an image frame to be processed; a front-end module coupled with the input module, for receiving the image frame sent by the input module and predicting the image quality of the image frame to obtain a prediction result; and a coding restoration module coupled with the front-end module, for receiving the prediction result sent by the front-end module, determining the type of image restoration model for the image frame according to the prediction result to obtain an image restoration model matched with the prediction result, and restoring the image frame through the matched image restoration model.
In order to solve the above technical problem, the third technical solution adopted by the present application is: providing an intelligent terminal, wherein the intelligent terminal comprises a memory and a processor which are coupled with each other; the memory stores program data; the processor is configured to execute the program data to implement the method of repairing an encoded image as described in any one of the above.
In order to solve the above technical problem, a fourth technical solution adopted by the present application is: the method comprises the steps of providing a coded image restoration system, wherein the coded image restoration system comprises an intelligent terminal and a camera connected with the intelligent terminal; the camera is used for acquiring a lossless source image corresponding to an image frame to be processed, and performing compression coding on the lossless source image to acquire a corresponding compression coding image; the intelligent terminal is used for receiving the compressed and coded image sent by the camera, decoding the compressed and coded image to obtain an image frame, predicting the image quality of the image frame to obtain a prediction result, determining the type of an image restoration model of the image frame based on the prediction result, obtaining an image restoration model matched with the prediction result, and restoring the image frame through the image restoration model matched with the prediction result.
In order to solve the above technical problem, a fifth technical solution adopted by the present application is: there is provided a computer readable storage medium having stored thereon program data executable by a processor to implement a method of repairing an encoded image as defined in any one of the above.
The beneficial effect of this application is as follows. Unlike the prior art, the method for restoring a coded image in the present application comprises: acquiring an image frame to be processed, and predicting the image quality of the image frame to obtain a prediction result; determining the type of image restoration model for the image frame based on the prediction result, to obtain an image restoration model matched with the prediction result; and restoring the image frame through the matched image restoration model. In this manner, image quality prediction is first performed on the compression-damaged image frame, the image restoration model best suited to the frame is determined from the prediction result, and the frame is then restored by that model. This effectively preserves the real-time performance of image-frame encoding and restoration, improves processing speed, and provides strong adaptive capability and high applicability.
Drawings
FIG. 1 is a schematic flowchart of a first embodiment of a method for restoring an encoded image according to the present application;
FIG. 2 is a flowchart illustrating a second embodiment of the method for restoring an encoded image according to the present application;
FIG. 3 is a flowchart illustrating a third embodiment of the method for restoring an encoded image according to the present application;
FIG. 4 is a flowchart illustrating a fourth embodiment of the method for restoring an encoded image according to the present application;
FIG. 5 is a flowchart illustrating a first embodiment of a method for building an image quality prediction network model according to the present application;
FIG. 6 is a flowchart illustrating a second embodiment of a method for building an image quality prediction network model according to the present application;
FIG. 7 is a schematic structural diagram of an embodiment of an image quality prediction network model according to the present application;
FIG. 8 is a flowchart illustrating a first embodiment of a method for training an intra-frame image restoration model according to the present application;
FIG. 9 is a flowchart illustrating a second embodiment of the intra-frame image inpainting model training method according to the present application;
FIG. 10 is a flowchart illustrating a first embodiment of a method for training an inter-frame image inpainting model according to the present application;
FIG. 11 is a flowchart illustrating a second embodiment of the interframe image inpainting model training method according to the present application;
FIG. 12 is a flowchart illustrating an embodiment of an image inpainting model training method according to the present application;
FIG. 13 is a schematic structural diagram of an embodiment of an intelligent terminal according to the present application;
FIG. 14 is a schematic structural diagram of another embodiment of an intelligent terminal according to the present application;
FIG. 15 is a schematic structural diagram of an embodiment of a system for restoring an encoded image according to the present application;
FIG. 16 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for restoring an encoded image according to a first embodiment of the present application. The embodiment comprises the following steps:
S11: Acquire an image frame to be processed, and predict the image quality of the image frame to obtain a prediction result.
Specifically, an encoded damaged code stream or a YUV image is first obtained through a monitoring device or an intelligent terminal. When the acquired data is an encoded damaged code stream, it is first decoded to obtain the corresponding decoded YUV image; the decoded YUV image, or a directly acquired YUV image, is then taken as the image frame to be restored.
Further, the image quality level that the image frame to be processed may reach after restoration by each of at least one image restoration model is predicted, to obtain a corresponding prediction result.
It can be understood that after an image frame to be processed is acquired by a monitoring device or an intelligent terminal integrating an end-to-end coded-image restoration framework, the image quality of the frame is predicted based on its feature information, for example its code rate. That is, the image quality level the frame may reach after restoration by each image restoration model integrated in the framework is predicted, so as to obtain, for every model, the predicted quality level the frame would reach after restoration, and thereby determine the model with the best restoration effect, i.e. the one whose restored image quality level is highest.
S12: Determine the type of image restoration model for the image frame based on the prediction result, to obtain the image restoration model matched with the prediction result.
Further, after the image quality prediction result for the image frame is obtained, the type of image restoration model for the frame is determined based on the prediction result; that is, the image restoration model that would yield the best image quality after restoration is identified and determined as the image restoration model matched with the prediction result.
S13: Restore the image frame through the image restoration model matched with the prediction result.
Further, the image frame to be processed is restored through the determined image restoration model matched with the prediction result, so that the image quality lost to compression encoding and decoding is effectively recovered and the restored frame is more acceptable to users.
It can be understood that, because image quality prediction is performed first after the image frame to be processed is obtained, the image restoration model with the best restoration effect can be selected automatically for image frames with compression artifacts of different strengths. The approach therefore has strong adaptive capability, does not affect the processing flow or real-time performance of existing video encoding and decoding algorithms, effectively preserves the real-time performance of image-frame encoding and restoration, improves restoration processing speed while guaranteeing restoration quality, and can be flexibly and conveniently applied to many different usage scenarios with high applicability.
Unlike the prior art, the method for restoring a coded image in the present application includes: acquiring an image frame to be processed and predicting the image quality of the image frame to obtain a prediction result; determining the type of image restoration model for the image frame based on the prediction result to obtain an image restoration model matched with the prediction result; and restoring the image frame through the image restoration model matched with the prediction result. In this way, image quality prediction is first performed on the damaged image frame, the image restoration model corresponding to the image frame is determined based on the prediction result, and the image frame is then restored, so that real-time encoding and restoration of image frames is effectively achieved, the corresponding processing speed is improved, and the method has strong self-adaptive capability and high applicability.
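The predict-select-repair flow summarized above can be sketched in a few lines of Python. This is a minimal illustration only: the `predict_quality` heuristic, the model table, and the toy "restoration models" are placeholders invented for the example, not the patent's actual networks.

```python
# Minimal sketch of the adaptive restoration flow: (1) predict image
# quality, (2) select the matching restoration model, (3) repair the
# frame with the selected model. All names are illustrative placeholders.

def predict_quality(frame):
    """Predict a quality label for the frame (e.g. an estimated QP level 0-5)."""
    # Placeholder heuristic: map the mean pixel value to a level.
    mean = sum(frame) / len(frame)
    return min(int(mean // 43), 5)

# One restoration "model" per predicted quality level (toy placeholders).
RESTORATION_MODELS = {
    level: (lambda frame, lvl=level: [p + lvl for p in frame])
    for level in range(6)
}

def restore_encoded_frame(frame):
    prediction = predict_quality(frame)     # step S11: predict quality
    model = RESTORATION_MODELS[prediction]  # step S12: select the model
    return model(frame)                     # step S13: repair the frame

restored = restore_encoded_frame([10, 20, 30, 40])
```

The key property illustrated is that the model choice is driven entirely by the prediction result, so frames with different artifact strengths are routed to different restoration models without manual selection.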
Referring to fig. 2, fig. 2 is a flowchart illustrating a second embodiment of a method for restoring an encoded image according to the present application. The embodiment comprises the following steps:
S21: acquiring an image frame to be processed and characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame to acquire characteristic parameters corresponding to corresponding image restoration models; the image frame is obtained by compression encoding a corresponding lossless source image with set quantization parameters and then decoding, and the characteristic parameters comprise the set quantization parameters or marks corresponding to the set quantization parameters.
Specifically, in the processing of internet video data, especially when uploading a video stream or a picture code stream in a wireless channel or a network, in order to control the storage and transmission costs of the corresponding video or picture code stream, video encoding is usually performed with a higher compression rate, that is, after a lossless source image is acquired through a monitoring device or an intelligent terminal, the lossless source image is first compressed and encoded by setting quantization parameters, so as to decode to obtain a corresponding image frame to be processed and characteristic information of the image frame, such as a yuv image and information of its width and height, code rate, frame type, and the like.
Further, the image quality of the image frame is predicted according to the feature information of the image frame to determine an image restoration model corresponding to the feature information of the image frame and feature parameters corresponding to the image restoration model, wherein the feature parameters specifically include set quantization parameters for compression encoding of the lossless source image or marks corresponding to the set quantization parameters, and the feature parameters can be understood as unique codes corresponding to the image restoration model, and each code uniquely corresponds to one image restoration model.
It can be understood that image frames to be processed with different feature information usually need image restoration models with different restoration strengths, and after determining the corresponding relationship between one or more feature information of the image frames to be processed and the image restoration models with different feature parameters and restoration effects thereof, the corresponding feature parameter of the image restoration model with the best restoration effect can be determined based on the obtained feature information of the image frames and the corresponding relationship, and then the image restoration model corresponding to the feature information of the image frames is determined.
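One way to realize the correspondence described above is a lookup table from feature information to a feature parameter (the unique code) and from there to a restoration model. The sketch below is a hedged illustration: the code-rate keys, the nearest-neighbor fallback, and the model identifiers are assumptions of the example, not values from the patent.

```python
# Hypothetical correspondence: feature information (here, code rate in
# kbps) -> feature parameter (unique code) -> restoration model id.
CODE_RATE_TO_FEATURE_PARAM = {512: 0, 1024: 1, 2048: 2, 4096: 3}
FEATURE_PARAM_TO_MODEL = {0: "model_strong", 1: "model_medium",
                          2: "model_light", 3: "model_minimal"}

def select_model(code_rate_kbps):
    """Map a frame's code rate to the uniquely corresponding model id."""
    # Fall back to the nearest known code rate for unseen values.
    nearest = min(CODE_RATE_TO_FEATURE_PARAM,
                  key=lambda r: abs(r - code_rate_kbps))
    feature_param = CODE_RATE_TO_FEATURE_PARAM[nearest]
    return FEATURE_PARAM_TO_MODEL[feature_param]
```

Because each feature parameter maps to exactly one model, selecting a model reduces to two dictionary lookups once the correspondence has been established.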
S22: and determining the type of the image restoration model of the image frame through the characteristic parameters to obtain the image restoration model corresponding to the characteristic parameters.
Specifically, after the image quality of the image frame to be processed is predicted to obtain the characteristic parameters corresponding to the corresponding image restoration models, the type of the image restoration model of the image frame is further determined according to the characteristic parameters, so as to obtain the image restoration model uniquely corresponding to the characteristic parameters.
S23: and repairing the image frame through an image repairing model corresponding to the characteristic parameters.
Furthermore, the image frame is repaired through the image repairing model which is uniquely corresponding to the characteristic parameters, so that the image quality of the image frame to be processed, which is caused by compression encoding and decoding, is effectively repaired, and the image frame is more easily accepted by a user.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for restoring an encoded image according to a third embodiment of the present application. The method for restoring an encoded image according to this embodiment is a flowchart of a detailed embodiment of the method for restoring an encoded image in fig. 1, and includes the following steps:
S31: inputting image frames into a plurality of image restoration models to respectively restore the image frames, and establishing a correspondence between the characteristic information of the image frames and the image restoration model with the best restoration effect, so as to obtain a preset correspondence; the image frames comprise a plurality of frames.
Specifically, a monitoring device or an intelligent terminal integrated with an end-to-end encoded image restoration framework includes a plurality of image restoration models with a set number, so as to respectively select one image restoration model with the best restoration effect from image frames to be processed with different compression artifacts, such as different blocking effects, ringing effects and breathing effects, to perform restoration processing on the image frames.
After a plurality of image frames sharing common characteristic information, such as common code rates, are obtained, the image frames are sequentially input into the plurality of image restoration models so that each image frame is restored by each model. The restoration effect of each image restoration model on each image frame is then compared, and a correspondence is established between the characteristic information of each image frame and the image restoration model with the best restoration effect, thereby obtaining the preset correspondence.
It can be understood that the preset corresponding relationship refers to a unique corresponding functional relationship established between the feature information of each image frame and the image restoration model with the best restoration effect, so that after the feature information of the image frame to be processed is obtained, the image restoration model corresponding to the image frame can be uniquely determined.
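Building the preset correspondence amounts to trying every restoration model on frames that share a given piece of feature information and keeping the model whose output is closest to the lossless source. The sketch below uses mean squared error as a stand-in quality metric; the patent does not specify which metric is used, so the metric and the toy models are assumptions of this illustration.

```python
def mse(a, b):
    """Mean squared error between a restored frame and the lossless source."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def build_correspondence(frames_by_feature, models, sources):
    """For each feature value, pick the model whose restored output is
    closest to the lossless source, and record that correspondence."""
    correspondence = {}
    for feature, frame in frames_by_feature.items():
        source = sources[feature]
        best = min(models, key=lambda name: mse(models[name](frame), source))
        correspondence[feature] = best
    return correspondence

# Toy example: two "models" (identity vs. +1 brightness) on one frame.
models = {"identity": lambda f: f, "plus_one": lambda f: [p + 1 for p in f]}
corr = build_correspondence({"1024kbps": [9, 19, 29]},
                            models, {"1024kbps": [10, 20, 30]})
```

Once built, `corr` is exactly the unique feature-to-model mapping that the text calls the preset correspondence.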
In another embodiment, after a lossless source image is obtained, the lossless source image may be further encoded through a plurality of different setting parameters to obtain encoded images with different quality levels, and a plurality of image frames to be processed and corresponding feature parameters thereof are obtained after corresponding decoding, so that the plurality of image frames can be sequentially input into a plurality of image restoration models to respectively restore each image frame, and a corresponding relationship is established between the feature information of each image frame and the image restoration model with the best restoration effect, thereby obtaining a preset corresponding relationship.
In another embodiment, the characteristic information of the image frame is selected as the code rate: the image frames at a set target code rate are respectively input into a plurality of image restoration models to be restored into repaired yuv images, and the image restoration model with the best restoration effect is selected by comparison as the image restoration model for that target code rate, thereby establishing a correspondence. A plurality of target code rates may be set, and image frames whose known code rates equal the set target code rates are sequentially input into the plurality of image restoration models to establish the correspondences one by one, so as to obtain the preset correspondence. In other embodiments, the characteristic information of the image frames may also be the resolution or bit rate, so that the resolution or bit rate of each image frame corresponds one-to-one with the image restoration model with the best restoration effect, thereby obtaining the preset correspondence.
It can be understood that the target code rate may be all common code rates that the image frame to be processed may have, so as to be able to adapt to a common application scenario, and the image frame is subjected to the restoration processing through the pre-integrated image restoration model corresponding to each current image frame with the common code rate.
S32: and acquiring the image frame to be processed and the characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame and a preset corresponding relation to obtain a prediction result.
Specifically, after the image frame to be processed and the feature information of the image frame are acquired, the image quality of the image frame is predicted according to the feature information of the image frame and a preset corresponding relationship which is established and acquired in advance to obtain a corresponding prediction result, that is, the image quality of the image frame after repair is predicted by comparison, and the image repair model with the best repair effect is determined.
In another embodiment, the feature information of the image frame is a code rate, so that when the image frame to be processed and the code rate of the image frame are acquired, the image quality of the image frame is predicted according to the code rate of the image frame and a preset corresponding relation established before, so as to obtain a corresponding prediction result.
In another embodiment, after the image frame to be processed is acquired, the image frame may be further input into a trained image quality prediction network model, so as to predict the image quality of the image frame through the image quality prediction network model and obtain a corresponding prediction result, without establishing and acquiring a corresponding preset corresponding relationship in advance.
S33: and determining the type of the image restoration model of the image frame based on the prediction result to obtain the image restoration model matched with the prediction result.
S34: and repairing the image frame through an image repairing model matched with the prediction result.
S33 and S34 are the same as S12 and S13 in fig. 1, respectively; please refer to S12 and S13 and the related text description, which are not repeated herein.
Referring to fig. 4, fig. 4 is a flowchart illustrating a fourth embodiment of a method for restoring an encoded image according to the present application. The method for restoring an encoded image according to this embodiment is a flowchart of a detailed embodiment of the method for restoring an encoded image in fig. 2, and includes the following steps:
S41: acquiring an image frame to be processed and characteristic information of the image frame, inputting the image frame and its characteristic information into a trained image quality prediction network model to acquire the set quantization parameter corresponding to the corresponding image restoration model, and encoding the set quantization parameter together with the frame type to obtain the mark corresponding to the characteristic parameter; the image frame is obtained by compression encoding a corresponding lossless source image with set quantization parameters and then decoding, and the characteristic parameters comprise the set quantization parameters or marks corresponding to the set quantization parameters.
Specifically, after feature information of an image frame to be processed and the image frame is acquired, the image frame and the feature information of the image frame are input into a trained image quality prediction network model, wherein the feature information of the image frame specifically includes a frame type of the image frame, so that a set quantization parameter of an image restoration model corresponding to the feature information of the image frame is given through the image quality prediction network model, a corresponding feature parameter is further obtained through combination of the frame type of the image frame and the set quantization parameter, and a corresponding mark is obtained through corresponding coding.
S42: and determining the type of the image restoration model of the image frame through the characteristic parameters to obtain the image restoration model corresponding to the characteristic parameters.
S42 is the same as S22 in fig. 2, and please refer to S22 and the related text description, which are not repeated herein.
S43: and repairing the image frame through an intra-frame image repairing model or an inter-frame image repairing model corresponding to the characteristic parameters.
Furthermore, the image frame is repaired through the intra-frame image repairing model or the inter-frame image repairing model which is uniquely corresponding to the characteristic parameters, so that the image quality of the image frame to be processed, which is damaged due to compression encoding and decoding, is effectively repaired, and the image frame is more easily accepted by a user.
It can be understood that, because the characteristic parameter is obtained by encoding the frame type of the image frame to be processed together with the corresponding set quantization parameter, the characteristic parameter by itself determines whether the image frame corresponds to the intra-frame image restoration model or the inter-frame image restoration model, and no additional selection between the two is needed. In other embodiments, the characteristic parameter may not include the frame type of the image frame or its corresponding code; in that case, when the current image frame to be processed is obtained, the frame type of the image frame is first distinguished to determine whether the corresponding image restoration model is an intra-frame or inter-frame image restoration model, and then the model among them that uniquely corresponds to the image frame with the best predicted restoration effect is further determined.
When the frame type of the image frame is determined to be an I frame, the image frame is repaired through an intra-frame image repair model corresponding to the characteristic parameters. And when the frame type of the image frame is determined to be a P frame and/or a B frame, the image frame is repaired through an inter-frame image repair model corresponding to the characteristic parameters.
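The frame-type dispatch described above can be sketched as follows; the frame-type strings and the toy model callables are illustrative assumptions, not the patent's implementation.

```python
def repair_frame(frame, frame_type, intra_model, inter_model):
    """Route I frames to the intra-frame restoration model and P/B
    frames to the inter-frame restoration model, as described above."""
    if frame_type == "I":
        return intra_model(frame)
    elif frame_type in ("P", "B"):
        return inter_model(frame)
    raise ValueError(f"unknown frame type: {frame_type}")

# Toy models for illustration only.
intra = lambda f: [p * 2 for p in f]
inter = lambda f: [p + 1 for p in f]
```

The dispatch is deterministic given the frame type, which is why encoding the frame type into the characteristic parameter removes the need for a separate intra/inter selection step.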
Referring to fig. 5, fig. 5 is a flowchart illustrating a method for building a network model for image quality prediction according to a first embodiment of the present disclosure. The embodiment comprises the following steps:
S51: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a lossless source image is encoded through a plurality of set quantization parameters to obtain a plurality of corresponding compressed coded images, and a plurality of image frames to be processed are obtained after corresponding decoding. Image frames encoded and decoded with different set quantization parameters usually exhibit different compression artifacts; therefore, image restoration models of different strengths are required to restore each image frame correspondingly, so as to ensure that the final restoration effect is relatively optimal.
S52: the first preset network model is trained through a plurality of set quantization parameters and a plurality of image frames corresponding to the set quantization parameters so as to establish an image quality prediction network model.
Furthermore, a first preset network model is trained through a plurality of set quantization parameters and a plurality of image frames corresponding to the set quantization parameters, so that the predicted set quantization parameters corresponding to each image frame are given through the first preset network model, and the error between the predicted set quantization parameters and the actual values of the set quantization parameters is continuously optimized, thereby establishing the image quality prediction network model.
Referring to fig. 6, fig. 6 is a flowchart illustrating a second embodiment of a method for building an image quality prediction network model according to the present application. The embodiment comprises the following steps:
S61: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a set of QP values, i.e., a plurality of set quantization parameters, is used to encode the lossless source image to obtain a plurality of compressed encoded images with different quality levels after encoding, and a plurality of image frames are obtained after corresponding decoding.
S62: inputting a plurality of set quantization parameters and a plurality of image frames corresponding to the set quantization parameters into a first preset network model, respectively predicting the image quality of the plurality of image frames through the preset network model, and respectively providing predicted quantization parameters corresponding to the corresponding image restoration models.
Furthermore, data composed of a plurality of set quantization parameters and a plurality of image frames decoded correspondingly to the set quantization parameters are input into a first preset network model as training data, so that image quality prediction is respectively carried out on the plurality of image frames through the first preset network model, and prediction quantization parameters corresponding to the image restoration models are respectively given.
S63: and continuously optimizing the errors between the set quantization parameters and the prediction quantization parameters corresponding to the set quantization parameters one by one through an error back propagation algorithm, thereby establishing an image quality prediction network model.
Further, the errors between the set quantization parameters and the predicted quantization parameters are compared, and the error between each set quantization parameter and its corresponding predicted quantization parameter is continuously optimized through an error back-propagation algorithm. The relevant internal parameters of the first preset network model are thereby continuously optimized; that is, the model is optimized by continuously reducing the error between the predicted quantization parameters it outputs and the set quantization parameters, so as to establish the image quality prediction network model.
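The optimization loop above — minimize the error between the set QP and the predicted QP — can be demonstrated with a toy gradient-descent example. The real model is a convolutional network trained by back-propagation; this one-parameter linear predictor is a deliberately simplified stand-in that only illustrates the squared-error loss and its gradient.

```python
# Toy stand-in for the image quality prediction network: predict the QP
# from a single scalar feature via x' = w * feature. This illustrates
# minimizing the squared-error loss L = (1/N) * sum_i (x_i - x'_i)^2.
samples = [(1.0, 22.0), (2.0, 44.0), (3.0, 66.0)]  # (feature, set QP)

w = 0.0
lr = 0.01
for _ in range(2000):
    # Gradient of the mean squared error with respect to w.
    grad = sum(-2 * f * (qp - w * f) for f, qp in samples) / len(samples)
    w -= lr * grad

loss = sum((qp - w * f) ** 2 for f, qp in samples) / len(samples)
```

After training, `w` converges to 22 (each sample's QP is exactly 22x its feature), and the loss approaches zero — the same "continuously reduce the prediction error" behavior the text describes, in miniature.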
In a specific embodiment, a set of quantization parameters, that is, a plurality of QP values, are first used to encode an original lossless high-definition video or picture to obtain encoded compressed encoded images with different quality levels, and data formed by the compressed encoded images and corresponding QP values are used as training data to train an image quality prediction network model.
The image quality prediction network model may be designed by using a plurality of convolutional layers + a plurality of active layers + a plurality of fully connected layers. In addition, since the influence of the bit rate needs to be considered, the bit rate (for example, bpp, bit per pixel, which is obtained by converting the target bit rate and the video resolution) is used as an auxiliary label and added to the full link layer for training, and a specific network design structure diagram is shown in fig. 7, where fig. 7 is a structure diagram of an embodiment of the image quality prediction network model of the present application.
In the image quality prediction network model, assume that the QP value of an input image frame i (i = 1, 2, 3, …) is $x_i$, and that the QP value predicted by the image quality prediction network model for the current video frame at a given code rate is $x'_i$. The error between the predicted value and the actual value is then defined using an absolute-value loss function or a squared-error (variance) loss, which can be expressed as:

$$\ell = \frac{1}{N}\sum_{i=1}^{N}\left|x_i - x'_i\right|$$

or

$$\ell = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - x'_i\right)^2$$

where $\ell$ denotes the error between the predicted and actual values, N (N = 1, 2, 3, …) is the number of samples input to the network, and i denotes the sequence number of the training samples.
Furthermore, after the loss function is designed, the image quality prediction network model can be optimized by using an error back propagation algorithm, so that parameters in the image quality prediction network model are continuously optimized, and finally, the image quality prediction network model can distinguish the quality of the image frame, namely, the damage degree corresponding to the image frame.
The information finally output by the image quality prediction network model includes, but is not limited to, the following flags (attribute information): the type of the coded frame (for example, I/P/B frames may be assigned flags 0, 1, and 2, respectively); and the image quality level (for example, when the QP values used for encoding are 22, 27, 32, 37, 42, and 47, the image quality level flags are set to 0, 1, 2, 3, 4, and 5, respectively). Other information may be added if necessary.
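The flag values listed above can be encoded as a (frame-type flag, quality-level flag) pair. The sketch below follows the example values given in the text (I/P/B mapped to 0/1/2, QP values 22-47 mapped to levels 0-5); the combined single-integer code is an assumption of this illustration, since the text does not specify how the flags are packed.

```python
FRAME_TYPE_FLAG = {"I": 0, "P": 1, "B": 2}          # from the text
QP_TO_QUALITY_FLAG = {22: 0, 27: 1, 32: 2, 37: 3, 42: 4, 47: 5}

def encode_flags(frame_type, qp):
    """Return (frame-type flag, quality flag) plus a combined code that
    uniquely identifies one restoration model. The packing scheme
    (type * 6 + level) is illustrative, not specified by the text."""
    ft = FRAME_TYPE_FLAG[frame_type]
    ql = QP_TO_QUALITY_FLAG[qp]
    combined = ft * len(QP_TO_QUALITY_FLAG) + ql
    return ft, ql, combined
```

Because the combined code is injective over (frame type, quality level), each code corresponds to exactly one restoration model, matching the "unique code per model" property described earlier.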
It can be understood that after the image quality of the image frame to be processed is predicted by the image quality prediction network model, the obtained flags uniquely correspond to one image restoration model, that is, the flags are the corresponding marks of the characteristic parameters corresponding to the image restoration model.
Referring to fig. 8, fig. 8 is a flowchart illustrating a method for training an intra-frame image restoration model according to a first embodiment of the present application. The embodiment comprises the following steps:
S81: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a set of QP values, i.e., a plurality of set quantization parameters, is used to encode the lossless source image to obtain a plurality of compressed encoded images with different quality levels after encoding, and a plurality of image frames are obtained after corresponding decoding.
S82: and training the second preset network model through a plurality of set quantization parameters and images with frame types of I frames in a plurality of corresponding image frames so as to establish an intra-frame image restoration model.
Further, training a second preset network model through a plurality of set quantization parameters and images with frame types of I frames in a plurality of corresponding image frames, so as to repair the images with the frame types of I frames in the plurality of image frames through the second preset network model, and continuously optimizing internal parameters of the second preset network model, so that the repair effect on the images with the frame types of I frames in the plurality of image frames reaches a preset threshold value, thereby establishing the intra-frame image repair model.
Referring to fig. 9, fig. 9 is a flowchart illustrating a method for training an intra-frame image restoration model according to a second embodiment of the present application. The embodiment comprises the following steps:
S91: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a set of QP values, i.e., a plurality of set quantization parameters, is used to encode the lossless source image to obtain a plurality of compressed encoded images with different quality levels after encoding, and a plurality of image frames are obtained after corresponding decoding.
S92: and inputting the set quantization parameters and the images with the frame type I frames in the corresponding image frames into a second preset network model so as to respectively repair the images with the frame type I frames in the image frames through the second preset network model.
Furthermore, the set quantization parameters and the images with the frame type I frames in the corresponding image frames are input into a second preset network model, so that the images with the frame type I frames in the image frames are repaired through the second preset network model.
S93: and continuously adjusting the layer number of the pyramid network model corresponding to the second preset network model and the depth of the residual error network model so that the repairing effect of the second preset network model on the image with the frame type I in the plurality of image frames exceeds a first set threshold value, thereby establishing the intra-frame image repairing model.
Specifically, a pyramid network model and a residual network model are integrated in the second preset network model, and the number of levels of the pyramid network model and the depth of the residual network model are continuously adjusted, for example gradually increased, during the process of repairing the I-frame images among the plurality of image frames through the second preset network model. The increase is stopped once the repairing effect of the second preset network model on the I-frame images exceeds the first set threshold. In this way, while the repairing effect on the I-frame images is satisfied, the gradient-vanishing problem caused by adding unnecessary depth to a deep neural network is effectively alleviated and unnecessary computational overhead is reduced, thereby establishing the intra-frame image restoration model.
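The depth-adjustment procedure above — grow the depth step by step and stop as soon as the restoration quality exceeds the first set threshold — can be sketched as a simple loop. The `train_and_evaluate` callback and the PSNR-style threshold value are hypothetical placeholders for the actual training-and-evaluation routine.

```python
def grow_until_good_enough(train_and_evaluate, threshold_psnr=38.0,
                           max_depth=16):
    """Increase model depth step by step; stop as soon as the repair
    quality on I frames exceeds the set threshold, avoiding the extra
    depth that wastes computation and hinders gradient propagation."""
    depth = 1
    while depth <= max_depth:
        psnr = train_and_evaluate(depth)  # hypothetical training routine
        if psnr > threshold_psnr:
            return depth
        depth += 1
    return max_depth

# Toy evaluator: quality improves linearly with depth.
chosen = grow_until_good_enough(lambda d: 30.0 + 2.0 * d)
```

The loop returns the smallest depth that satisfies the quality threshold, which is precisely the "stop increasing once the effect exceeds the first set threshold" rule in the text.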
The first set threshold is set to enable the image quality of an image with a frame type of I frame in the repaired multiple image frames to meet the viewing requirement of the user, and the first set threshold may be specifically set by the user according to the requirement, which is not limited by the present application.
Referring to fig. 10, fig. 10 is a schematic flowchart illustrating a first embodiment of a method for training an inter-frame image inpainting model according to the present application. The embodiment comprises the following steps:
S101: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a set of QP values, i.e., a plurality of set quantization parameters, is used to encode the lossless source image to obtain a plurality of compressed encoded images with different quality levels after encoding, and a plurality of image frames are obtained after corresponding decoding.
S102: and training a third preset network model through a plurality of set quantization parameters and images of every two or more adjacent image frames in a plurality of image frames corresponding to the set quantization parameters, wherein the frame types of every two or more adjacent image frames are P frames and/or B frames, so as to establish an interframe image restoration model.
Further, a third preset network model is trained through a plurality of set quantization parameters and the images of every two or more adjacent frames, among the corresponding plurality of image frames, whose frame types are P frames and/or B frames (that is, the adjacent frames are all P frames, all B frames, or a mixture of P frames and B frames). The third preset network model repairs these adjacent P-frame and/or B-frame images while its internal parameters are continuously optimized, until the repairing effect on the adjacent P-frame and/or B-frame images reaches a preset threshold, thereby establishing the inter-frame image restoration model.
Referring to fig. 11, fig. 11 is a flowchart illustrating a method for training an inter-frame image restoration model according to a second embodiment of the present application. The embodiment comprises the following steps:
S111: and coding the lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames.
Specifically, a set of QP values, i.e., a plurality of set quantization parameters, is used to encode the lossless source image to obtain a plurality of compressed encoded images with different quality levels after encoding, and a plurality of image frames are obtained after corresponding decoding.
S112: and inputting the set quantization parameters and the images of every two or more adjacent frame types of the plurality of image frames corresponding to the set quantization parameters into a third preset network model so as to repair the images of every two or more adjacent frame types of the plurality of image frames of the P frames and/or the B frames through the third preset network model.
Further, the plurality of set quantization parameters and the images of every two or more adjacent frames whose frame types are P frames and/or B frames among the corresponding plurality of image frames are input into the third preset network model, so that these adjacent P-frame and/or B-frame images are repaired through the third preset network model.
S113: and continuously adjusting the complexity corresponding to the third preset network model so that the repairing effect of the third preset network model on the images with the types of P frames and/or B frames adjacent to each other in the plurality of image frames exceeds a second set threshold value, thereby establishing the inter-frame image repairing model.
Specifically, the third preset network model includes an inter-frame matching block model, which improves the robustness of the network model by exploiting the reference relationship between adjacent frames. While the third preset network model repairs every two or more adjacent images whose frame types are P frames and/or B frames, the complexity of the model is continuously adjusted until its repairing effect on these images exceeds a second set threshold, thereby establishing the inter-frame image repairing model while avoiding unnecessary computation overhead.
The second set threshold is chosen so that the image quality of every two or more adjacent repaired images whose frame types are P frames and/or B frames meets the viewing requirements of the user; it may be set by the user as needed, which is not limited in the present application.
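The adjust-until-threshold training loop described above can be sketched as follows; `train_fn`, `eval_fn`, the complexity counter, and the numeric threshold are illustrative placeholders rather than names from the application:

```python
def build_inter_model(train_fn, eval_fn, threshold, max_rounds=5):
    # Increase model complexity until the repairing effect (as measured by
    # eval_fn, e.g. PSNR gain on P/B frames) exceeds the second set threshold.
    complexity = 1
    model = None
    for _ in range(max_rounds):
        model = train_fn(complexity)
        if eval_fn(model) > threshold:
            break  # repairing effect is good enough; stop growing the model
        complexity += 1
    return model
```

Stopping as soon as the threshold is crossed is what keeps the model from growing larger than the quality target requires.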
In a specific embodiment, referring to fig. 12, fig. 12 is a schematic flowchart of an embodiment of the image inpainting model training method of the present application. The embodiment comprises the following steps:
Specifically, a group of lossless high-definition video or picture data sets is first collected and encoded with the H.264/H.265 coding standard, the quality of the encoding being controlled through a plurality of different QP values. For example, n groups of image frames with different compression artifact strengths (n being a positive integer) are generated according to the QP values to serve as training data; the size of n can be set according to actual needs, a common group of settings being QP = 22, 27, 32, 37, 42, 47. A convolutional-neural-network-based image inpainting model for removing compression artifacts in the image frames is then constructed, and a plurality of image inpainting models are trained.
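As an illustration of this data-generation step, the sketch below pairs each QP value with a group of (degraded, lossless) training pairs; the `simulate_codec` quantizer is a toy stand-in for a real H.264/H.265 encode-decode round trip, and the QP-to-step mapping is hypothetical:

```python
import numpy as np

QP_SET = [22, 27, 32, 37, 42, 47]  # the common group of QP settings above

def simulate_codec(frame, qp):
    # Stand-in for an H.264/H.265 encode + decode round trip: coarser
    # quantization at higher QP produces stronger compression artifacts.
    step = 2 ** (qp // 8)          # hypothetical QP -> quantizer-step mapping
    return (frame // step) * step

def build_training_groups(source_frames):
    # One group of (degraded, lossless) training pairs per QP value, i.e.
    # n groups of image frames with different compression-artifact strengths.
    return {qp: [(simulate_codec(f, qp), f) for f in source_frames]
            for qp in QP_SET}
```

Each group then trains the restoration model of the corresponding strength.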
Furthermore, according to the frame type (I/P/B frame) of the current image frame, the image restoration models are divided into two categories, namely the intra-frame image restoration model and the inter-frame image restoration model; therefore, after the n groups of image frames with different compression artifact strengths are obtained, it is first judged whether the current image frame is an I frame.
Since the intra-frame (I-frame) image restoration model does not involve the relationship between frames, the convolutional network can be designed in many ways, for example as a pyramid model combined with a residual network.
Firstly, the input training data is processed through a plurality of residual networks and then sent into an m-layer pyramid network (m is a positive integer whose value can be set as required) for further feature learning and processing. The pyramid network down-samples the previously learned image features by factors of 1/2, 1/4, 1/8, 1/16, and so on; the more pyramid layers there are, the larger the down-sampling factor. In addition, each layer of the pyramid network can internally connect a plurality of residual networks in series.
It can be understood that the pyramid network extracts image features from coarse to fine through successive down-sampling, while the residual network improves the accuracy of the network model by increasing depth and at the same time alleviates the vanishing-gradient problem that increased depth causes in deep neural networks.
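As a rough illustration of this coarse-to-fine structure, the following numpy sketch combines residual connections with multi-scale processing; the fixed `tanh` transform and nearest-neighbour up-sampling are toy stand-ins for learned convolution layers, not the application's actual network:

```python
import numpy as np

def residual_block(x, weight=0.1):
    # Residual connection: output = input + transform(input); the tanh here
    # stands in for a stack of learned convolutions.
    return x + weight * np.tanh(x)

def pyramid_restore(x, levels=4):
    # Process the features at several scales (full, 1/2, 1/4, 1/8 resolution
    # for levels=4), apply a residual block at each scale, then up-sample
    # and average the multi-scale results.
    out = np.zeros_like(x)
    for lvl in range(levels):
        step = 2 ** lvl
        coarse = residual_block(x[::step, ::step])   # coarse-scale features
        up = np.kron(coarse, np.ones((step, step)))  # nearest-neighbour upsample
        out += up[: x.shape[0], : x.shape[1]] / levels
    return out
```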
The inter-frame (P/B frame) image restoration model needs to consider the reference relationship between adjacent frames, so a motion estimation method is first adopted to find matching blocks in the adjacent frames. This step can be performed with a conventional motion estimation method (e.g., the diamond search method or the global search method) or with a convolutional-neural-network-based method; in the latter case, model training can still take the pyramid network and residual network as reference and use two or more adjacent frames of images as the network input. This step is optional in the design of the inter-frame network model: on devices with better performance, the obtained matching blocks can be used as the input of the inter-frame network to improve its robustness, while on devices with poorer performance, the co-located blocks of adjacent frames can be used directly as the input to reduce computation overhead.
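For the conventional branch, a small-diamond-pattern search over the sum-of-absolute-differences (SAD) cost can be sketched as follows; this is a simplified illustration, not the application's exact procedure:

```python
import numpy as np

def sad(a, b):
    # Sum of absolute differences, the usual block-matching cost.
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def diamond_search(ref, cur, by, bx, bs=8, max_iter=32):
    # Find, in the reference frame, the block best matching the block of the
    # current frame at (by, bx); returns the motion vector (dy, dx).
    block = cur[by:by + bs, bx:bx + bs]
    h, w = ref.shape
    dy = dx = 0
    best = sad(ref[by:by + bs, bx:bx + bs], block)
    for _ in range(max_iter):
        cands = []
        for oy, ox in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # small diamond
            y, x = by + dy + oy, bx + dx + ox
            if 0 <= y <= h - bs and 0 <= x <= w - bs:
                cands.append((sad(ref[y:y + bs, x:x + bs], block), oy, ox))
        cost, oy, ox = min(cands)
        if cost >= best:
            break  # no neighbouring position improves the match
        best, dy, dx = cost, dy + oy, dx + ox
    return dy, dx
```

The block the vector points to is what the options below use as one input path of the inter-frame network.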
After the input of the interframe image restoration model is determined, the design of the network can adopt the following two modes:
(a) n-path matching-block preprocessing + intra-frame network model: the matching blocks of n adjacent frames are used as n input paths (the value of n can be set by the user, for example 3 or 5); each path undergoes m convolution operations (m is also set by the user), after which the convolution results of all paths are concatenated and sent into the intra-frame image restoration model for subsequent processing;
(b) n-path matching-block preprocessing + LSTM (Long Short-Term Memory) network model: after the same n-path matching-block preprocessing as in (a), the result is sent to an LSTM network for subsequent processing; here the value of n may be set relatively large, for example 20 or 30.
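The n-path preprocessing shared by designs (a) and (b) might look like the sketch below, where the per-path `tanh` passes are placeholders for the m real convolution operations:

```python
import numpy as np

def path_convs(block, m=2):
    # Stand-in for the m per-path convolution operations (m set by the user).
    for _ in range(m):
        block = np.tanh(block)
    return block

def nway_preprocess(matched_blocks, m=2):
    # Each matched block from the n adjacent frames is one input path; after
    # per-path processing the paths are concatenated (stacked along a channel
    # axis) and handed to the intra-frame model in design (a), or consumed as
    # n time steps by the LSTM in design (b).
    return np.stack([path_convs(b, m) for b in matched_blocks], axis=0)
```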
In addition, network models of different complexity may be used for different QP values. When the QP value is small, the image texture is relatively complex, and a more complex network structure can be designed to extract image features, for example with more network layers or more filters; when the QP value is large, the image texture is relatively simple, and the complexity of the network can be reduced appropriately to cut its computation cost.
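A hypothetical mapping from QP to network size, with illustrative cut-offs and layer/filter counts not taken from the application, could be:

```python
def model_complexity(qp):
    # Small QP -> complex texture survives encoding, so use a deeper network
    # with more filters; large QP -> simpler texture, so a lighter network
    # suffices. All numbers here are illustrative.
    if qp <= 27:
        return {"layers": 20, "filters": 64}
    if qp <= 37:
        return {"layers": 12, "filters": 48}
    return {"layers": 8, "filters": 32}
```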
Further, for different application scenarios, such as the PC (personal computer) side or a hardware device, the network model may be adjusted or tailored according to the available computing resources. When computing resources or computing speed are limited, a lighter-weight network can be obtained by reducing the number of network layers, the number of filters, or the number of pyramid levels, and used on devices with poorer performance.
The image restoration model is trained on data formed by the compression-encoded image frames and the lossless high-definition videos or pictures, and the model parameters are updated with an error back-propagation optimization algorithm so that the training error keeps decreasing until convergence. The loss function of the image restoration model can be the mean square error between the lossless high-definition video or picture and the output obtained through the image restoration model.
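The loss and update rule can be illustrated with a toy one-parameter model; `train_gain`, which fits a single gain so that the degraded frame approximates the lossless one, stands in for full back-propagation over network weights:

```python
import numpy as np

def mse_loss(restored, lossless):
    # Loss: mean square error between the lossless source and model output.
    return float(np.mean((restored - lossless) ** 2))

def train_gain(degraded, lossless, lr=0.5, steps=200):
    # Toy stand-in for error back-propagation: descend the gradient of the
    # MSE with respect to a single gain parameter g until the training error
    # stops shrinking.
    g = 1.0
    for _ in range(steps):
        grad = np.mean(2 * (g * degraded - lossless) * degraded)  # dL/dg
        g -= lr * grad
    return g
```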
Furthermore, according to the image quality prediction result given by the front-end module, the determined flags are used to select the intra-frame or inter-frame image restoration model corresponding to the frame type and the corresponding strength (selected according to the image quality level) to process the input image frame, so as to remove image-frame compression artifacts adaptively. For example, when the frame type flag is 0 and the image quality level flag is 2, an intra-frame image restoration model with QP 32 is selected, and the current frame is repaired through that model; the other cases follow analogously and are not described here again.
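Assuming a flag layout like the example above (frame-type flag 0 selecting the intra-frame models, the quality-level flag indexing the QP group 22..47), model selection reduces to a table lookup; the model names below are illustrative:

```python
# Hypothetical flag layout; the application does not fix the exact encoding.
QP_LEVELS = [22, 27, 32, 37, 42, 47]

MODELS = {(kind, qp): f"{kind}_qp{qp}"
          for kind in ("intra", "inter") for qp in QP_LEVELS}

def select_model(frame_type_flag, quality_level_flag):
    # Frame-type flag 0 -> intra-frame model; otherwise inter-frame model.
    kind = "intra" if frame_type_flag == 0 else "inter"
    return MODELS[(kind, QP_LEVELS[quality_level_flag])]
```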
When an image frame with improved image quality is obtained after the restoration processing, the above process can be repeated to restore the whole group of image frames in sequence; once the current sequence is judged to be completely restored, the restored yuv image is obtained and the corresponding restoration program ends.
Based on the general inventive concept, the present application further provides an intelligent terminal, please refer to fig. 13, and fig. 13 is a schematic structural diagram of an embodiment of the intelligent terminal of the present application. In this embodiment, the intelligent terminal 131 includes an input module 1311, a front-end module 1312, and a code repair module 1313.
The input module 1311 is configured to acquire the image frame to be processed. When the acquired image frame is a compressed and encoded image, it is first decoded to obtain the corresponding decoded yuv image and its feature information, which are sent to the front-end module 1312; when the acquired image frame is already a yuv image, it is sent directly to the front-end module 1312.
The front end module 1312 is coupled to the input module 1311, and is configured to receive the image frame to be processed sent by the input module 1311, and predict the image quality of the image frame, so as to send the obtained prediction result to the coding repair module 1313.
The code restoration module 1313 is coupled to the front-end module 1312, and configured to receive the prediction result sent by the front-end module 1312, determine the type of the image restoration model of the image frame according to the prediction result, obtain an image restoration model matching the prediction result, and restore the image frame through the image restoration model matching the prediction result.
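A minimal sketch of how the three modules could chain; the class names, method names, and the fixed stand-in prediction are illustrative, not the application's actual interfaces:

```python
class InputModule:
    # Decodes compressed input to a yuv frame; yuv input passes through.
    def acquire(self, data, is_encoded):
        return {"frame": data, "was_decoded": is_encoded}

class FrontEndModule:
    # Predicts image quality; here a fixed (frame_type_flag, level) stand-in.
    def predict(self, frame):
        return (0, 2)

class CodeRepairModule:
    # Picks and applies the restoration model matching the prediction.
    def repair(self, frame, prediction):
        kind = "intra" if prediction[0] == 0 else "inter"
        return f"frame restored by {kind} model at quality level {prediction[1]}"

def terminal_pipeline(data, is_encoded=True):
    frame = InputModule().acquire(data, is_encoded)
    pred = FrontEndModule().predict(frame)
    return CodeRepairModule().repair(frame, pred)
```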
Optionally, the intelligent terminal 131 may be one of intelligent terminals having image encoding and decoding and repairing processing functions, such as a mobile phone, a tablet computer, a computer, and a server, which are not limited in this application.
Based on the general inventive concept, the present application further provides an intelligent terminal, please refer to fig. 14, and fig. 14 is a schematic structural diagram of another embodiment of the intelligent terminal of the present application.
The intelligent terminal 141 includes a memory 1411 and a processor 1412, which are coupled to each other, the memory 1411 stores program data, and the processor 1412 is configured to execute the program data to implement the method for repairing a coded image as described in any one of the above embodiments.
Based on the general inventive concept, the present application further provides a system for restoring an encoded image, please refer to fig. 15, and fig. 15 is a schematic structural diagram of an embodiment of the system for restoring an encoded image according to the present application. The system 151 for repairing the encoded image includes an intelligent terminal 1511 and a camera 1512 connected to the intelligent terminal 1511.
The camera 1512 is configured to obtain the lossless source image corresponding to the image frame to be processed and to compress and encode it into the corresponding compressed encoded image. For example, a camera 1512 installed in a designated area monitors and photographs its surveillance region in real time; the lossless source image so obtained is compressed and encoded and then sent to the intelligent terminal 1511. In other embodiments, the camera 1512 may also send the obtained lossless source image directly to the intelligent terminal 1511, which then performs the encoding and decoding.
The intelligent terminal 1511 is configured to receive the compressed encoded image sent by the camera 1512 and decode it to acquire the image frame to be processed. A plurality of image restoration models are integrated in the intelligent terminal 1511, so that after the image quality of the image frame is predicted and the corresponding prediction result is obtained, the type of image restoration model for the image frame is determined based on the prediction result, the image restoration model matching the prediction result is obtained, and the image frame is restored through that model. In other embodiments, the camera 1512 may itself decode the compressed encoded image to obtain the image frame to be processed and then input it into the intelligent terminal 1511 for repair.
In another embodiment, the camera 1512 may be integrated into the intelligent terminal 1511: after a lossless source image is acquired directly by a camera 1512 carried on an intelligent terminal 1511 such as an unmanned aerial vehicle, an intelligent robot, a mobile phone, or a notebook computer, and is compressed and encoded accordingly, the corresponding image frame is repaired by the processor of the intelligent terminal 1511.
Optionally, the intelligent terminal 1511 may be one of the intelligent terminals having image encoding, decoding, and repair processing functions, such as a mobile phone, a tablet computer, a computer, or a server, which is not limited in this application.
Based on the general inventive concept, the present application further provides a computer-readable storage medium, please refer to fig. 16, and fig. 16 is a schematic structural diagram of an embodiment of the computer-readable storage medium according to the present application. In which a computer readable storage medium 161 has stored therein program data 1611, the program data 1611 being executable to implement any of the above-described methods of restoration of an encoded image.
In one embodiment, the computer readable storage medium 161 may be a memory chip in a terminal, a hard disk, or other readable and writable storage means such as a mobile hard disk or a flash disk, an optical disk, or the like, and may also be a server, or the like.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a processor or a memory is merely a logical division, and an actual implementation may have another division, for example, a plurality of processors and memories may be combined to implement the functions or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or connection may be an indirect coupling or connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in substance or in the part contributing over the prior art, or all or part of the technical solution, may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Unlike the prior art, the method for restoring a coded image in the present application includes: acquiring an image frame to be processed, and predicting the image quality of the image frame to obtain a prediction result; determining the type of an image restoration model of the image frame based on the prediction result to obtain an image restoration model matched with the prediction result; and repairing the image frame through the image restoration model matched with the prediction result. In this way, image quality prediction is first performed on the image frame whose quality is impaired, the image restoration model corresponding to the image frame is determined based on the corresponding prediction result, and the image frame is repaired, so that real-time encoding-artifact repair of image frames is effectively achieved, the corresponding processing speed is improved, and the scheme has strong adaptive capability and high applicability.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (20)

1. A method for restoring a coded picture, the method comprising:
acquiring an image frame to be processed, and predicting the image quality of the image frame to obtain a prediction result;
determining the type of an image restoration model of the image frame based on the prediction result to obtain an image restoration model matched with the prediction result;
and repairing the image frame through the image repairing model matched with the prediction result.
2. The method for restoring an encoded image according to claim 1, wherein the step of acquiring an image frame to be processed and predicting the image quality of the image frame to obtain a prediction result comprises:
acquiring the image frame to be processed and the characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame to acquire the characteristic parameters corresponding to the corresponding image restoration model; the image frame is obtained by carrying out compression coding and decoding on a corresponding lossless source image through set quantization parameters, and the characteristic parameters comprise the set quantization parameters or marks corresponding to the set quantization parameters;
the determining the type of the image restoration model of the image frame based on the prediction result, and the obtaining of the image restoration model matched with the prediction result comprises:
determining the type of an image restoration model of the image frame according to the characteristic parameters to obtain an image restoration model corresponding to the characteristic parameters;
the repairing the image frame through the image repairing model matched with the prediction result comprises the following steps:
and repairing the image frame through the image repairing model corresponding to the characteristic parameters.
3. The method for restoring an encoded image according to claim 1, wherein the step of acquiring an image frame to be processed and predicting the image quality of the image frame to obtain a prediction result comprises:
and acquiring the image frame to be processed and the characteristic information of the image frame, and predicting the image quality of the image frame according to the characteristic information of the image frame and a preset corresponding relation to obtain the prediction result.
4. The method for restoring an encoded image according to claim 3, wherein before the step of obtaining the image frame to be processed and the feature information of the image frame, and predicting the image quality of the image frame according to the feature information of the image frame and a preset correspondence, and obtaining the prediction result, the method further comprises:
inputting the image frames into a plurality of image restoration models to respectively restore the image frames, and establishing a corresponding relation between the characteristic information of the image frames and the image restoration model with the best restoration effect to acquire the preset corresponding relation, wherein the number of the image frames comprises a plurality of image frames.
5. The method according to claim 4, wherein the feature information of the image frame includes a code rate of the image frame, the step of inputting the image frame into a plurality of image restoration models to respectively restore the image frame, and establishing a corresponding relationship between the feature information of the image frame and an image restoration model with a best restoration effect, so as to obtain the preset corresponding relationship comprises:
respectively inputting the image frames with set target code rates into a plurality of image restoration models to respectively restore the image frames, and establishing a corresponding relation between the target code rates and the image restoration models with the best restoration effect to obtain the preset corresponding relation, wherein the number of the target code rates comprises a plurality of target code rates;
the step of acquiring the image frame to be processed and the feature information of the image frame, and predicting the image quality of the image frame according to the feature information of the image frame and a preset corresponding relation to obtain the prediction result comprises the following steps:
and acquiring the image frame to be processed and the code rate of the image frame, and predicting the image quality of the image frame according to the code rate of the image frame and the preset corresponding relation to obtain the prediction result.
6. The method for restoring an encoded image according to claim 1, wherein the step of acquiring an image frame to be processed and predicting the image quality of the image frame to obtain a prediction result comprises:
and acquiring the image frame to be processed, and inputting the image frame into a trained image quality prediction network model to predict the image quality of the image frame to obtain the prediction result.
7. The method according to claim 2, wherein the feature information includes a frame type of the image frame, and the step of acquiring the image frame to be processed and the feature information of the image frame, and predicting the image quality of the image frame according to the feature information of the image frame to acquire the feature parameters corresponding to the corresponding image restoration model includes:
the image restoration method comprises the steps of obtaining the image frame to be processed and the characteristic information of the image frame, inputting the image frame and the characteristic information of the image frame into a trained image quality prediction network model to obtain set quantization parameters corresponding to a corresponding image restoration model, and obtaining marks corresponding to the characteristic parameters through the frame type and the set quantization parameter coding.
8. The method according to claim 7, wherein the image restoration models include an intra-frame image restoration model and an inter-frame image restoration model, and the step of restoring the image frame by the image restoration model corresponding to the feature parameter includes:
and repairing the image frame through the intra-frame image repairing model or the inter-frame image repairing model corresponding to the characteristic parameters.
9. The method according to claim 8, wherein the frame types include I-frame, P-frame, and B-frame, and the step of repairing the image frame by the intra-frame image repair model or the inter-frame image repair model corresponding to the characteristic parameter comprises:
and if the frame type of the image frame is an I frame, repairing the image frame through the intra-frame image repairing model corresponding to the characteristic parameter.
10. The method for restoring an encoded image according to claim 9, further comprising:
and if the frame type of the image frame is a P frame and/or a B frame, repairing the image frame through the inter-frame image repairing model corresponding to the characteristic parameters.
11. The method for restoring an encoded image according to claim 6 or 7, wherein the step of training the prediction network model comprises:
coding a lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames;
and training a first preset network model through a plurality of set quantization parameters and a plurality of image frames corresponding to the set quantization parameters to establish the image quality prediction network model.
12. The method for restoring an encoded image according to claim 11, wherein the step of training a first predetermined network model by using a plurality of the predetermined quantization parameters and a plurality of the image frames corresponding to the predetermined quantization parameters to establish the image quality prediction network model comprises:
inputting a plurality of the set quantization parameters and a plurality of image frames corresponding to the set quantization parameters into the first preset network model, so as to respectively predict the image quality of the image frames through the first preset network model and respectively provide prediction quantization parameters corresponding to the image restoration models corresponding to the image frames;
and continuously optimizing the errors between the set quantization parameters and the prediction quantization parameters corresponding to the set quantization parameters one by one through an error back propagation algorithm, so as to establish the image quality prediction network model.
13. The method of claim 9, wherein the intra-frame image restoration model is trained, and the method of training the intra-frame image restoration model comprises:
coding a lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames;
and training a second preset network model through a plurality of set quantization parameters and a plurality of images with frame types of I frames in the image frames corresponding to the set quantization parameters so as to establish the intra-frame image restoration model.
14. The method according to claim 13, wherein the second predetermined network model is integrated with a pyramid network model and a residual network model, and the step of training the second predetermined network model by using a plurality of the predetermined quantization parameters and the image with frame type I in a plurality of image frames corresponding to the predetermined quantization parameters to establish the intra-frame image restoration model comprises:
inputting the plurality of preset quantitative parameters and the images with the frame types of I frames in the plurality of image frames corresponding to the preset quantitative parameters into the second preset network model so as to respectively repair the images with the frame types of I frames in the plurality of image frames through the second preset network model;
and continuously adjusting the level number of the pyramid network model and the depth of the residual error network model corresponding to the second preset network model so that the repairing effect of the second preset network model on the image with the frame type of I frame in the image frames exceeds a first set threshold value, thereby establishing the intra-frame image repairing model.
15. The method of claim 9, wherein the inter-frame image restoration model is trained, and the method of training the inter-frame image restoration model comprises:
coding a lossless source image through a plurality of set quantization parameters to obtain a plurality of compressed coded images, and correspondingly decoding the compressed coded images to obtain a plurality of image frames;
and training a third preset network model through a plurality of set quantization parameters and images of every two or more adjacent image frames with the frame types of P frames and/or B frames in a plurality of image frames corresponding to the set quantization parameters so as to establish the interframe image restoration model.
16. The method according to claim 15, wherein the third predetermined network model comprises an inter-frame matching block model, and the step of training the third predetermined network model by using a plurality of the predetermined quantization parameters and corresponding images of two or more adjacent image frames of which the frame types are P frames and/or B frames to establish the inter-frame image restoration model comprises:
inputting the preset quantitative parameters and the images with the two or more adjacent frame types of P frames and/or B frames in the plurality of image frames corresponding to the preset quantitative parameters into the third preset network model so as to respectively repair the images with the two or more adjacent frame types of P frames and/or B frames in the plurality of image frames through the third preset network model;
and continuously adjusting the complexity corresponding to the third preset network model to enable the repairing effect of the third preset network model on every two or more adjacent images with the frame types of P frames and/or B frames in a plurality of image frames to exceed a second set threshold value, so as to establish the inter-frame image repairing model.
17. An intelligent terminal, characterized in that, intelligent terminal includes:
the input module is used for acquiring an image frame to be processed;
the front-end module is coupled with the input module and used for receiving the image frames sent by the input module and predicting the image quality of the image frames to obtain a prediction result;
and the coding restoration module is coupled with the front-end module and used for receiving the prediction result sent by the front-end module, determining the type of the image restoration model of the image frame according to the prediction result, obtaining the image restoration model matched with the prediction result, and restoring the image frame through the image restoration model matched with the prediction result.
18. An intelligent terminal, characterized in that the intelligent terminal comprises a memory and a processor coupled to each other;
the memory stores program data;
the processor is configured to execute the program data to implement the method for restoring an encoded image according to any one of claims 1 to 16.
19. A system for restoring an encoded image, characterized in that the system comprises an intelligent terminal and a camera connected to the intelligent terminal;
the camera is used for acquiring a lossless source image corresponding to an image frame to be processed, and compression-encoding the lossless source image to obtain a corresponding compressed encoded image;
the intelligent terminal is used for receiving the compressed encoded image sent by the camera, decoding the compressed encoded image to obtain the image frame, predicting the image quality of the image frame to obtain a prediction result, determining the type of image restoration model for the image frame based on the prediction result, obtaining the image restoration model matched with the prediction result, and restoring the image frame through the image restoration model matched with the prediction result.
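The camera-to-terminal flow of claim 19 can be illustrated end to end with a toy pipeline, where simple scalar quantization stands in for compression encoding. All functions and the quality heuristic below are illustrative assumptions, not the patented codec.

```python
def camera_encode(source_pixels, step):
    """Camera side: 'compress' a lossless source by scalar quantization."""
    return [round(v / step) for v in source_pixels]

def terminal_decode(levels, step):
    """Terminal side: reconstruct the (lossy) image frame from the levels."""
    return [v * step for v in levels]

def predict_quality(step):
    """Terminal front end stand-in: coarser quantization -> lower quality."""
    return "low" if step >= 8 else "high"

def restore(frame, prediction):
    """Route only low-quality frames through the (placeholder) matched
    restoration model; high-quality frames pass through untouched."""
    if prediction == "high":
        return frame, None
    return frame, "restoration model applied"

step = 8
levels = camera_encode([10, 23, 200], step)      # camera: compress lossless source
frame = terminal_decode(levels, step)            # terminal: decode to image frame
frame, note = restore(frame, predict_quality(step))
```

Note the asymmetry the claim relies on: only the camera ever sees the lossless source image, so the terminal must infer quality from the decoded frame (here, from the quantization step) before choosing a restoration model.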
20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program data which, when executed, implements the method for restoring an encoded image according to any one of claims 1 to 16.
CN202010673891.2A 2020-07-14 2020-07-14 Method for repairing coded image and related system and device Pending CN112001854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010673891.2A CN112001854A (en) 2020-07-14 2020-07-14 Method for repairing coded image and related system and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010673891.2A CN112001854A (en) 2020-07-14 2020-07-14 Method for repairing coded image and related system and device

Publications (1)

Publication Number Publication Date
CN112001854A true CN112001854A (en) 2020-11-27

Family

ID=73467565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010673891.2A Pending CN112001854A (en) 2020-07-14 2020-07-14 Method for repairing coded image and related system and device

Country Status (1)

Country Link
CN (1) CN112001854A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07111654A (en) * 1993-10-13 1995-04-25 Toshiba Corp Moving image transmission system
CN108141609A (en) * 2015-10-21 2018-06-08 夏普株式会社 Prognostic chart picture generating means, picture decoding apparatus and picture coding device
WO2018171447A1 (en) * 2017-03-21 2018-09-27 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, computer device and storage medium
CN111314698A (en) * 2020-02-27 2020-06-19 浙江大华技术股份有限公司 Image coding processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CUI Ziguan; ZHU Xiuchang: "Image-Complexity-Adaptive I-Frame Rate Control Algorithm for H.264", Journal of Electronics &amp; Information Technology, no. 11 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112717389A (en) * 2020-12-31 2021-04-30 贵阳动视云科技有限公司 Cloud game image screen-splash repairing method and device
CN116112694A (en) * 2022-12-09 2023-05-12 无锡天宸嘉航科技有限公司 Video data coding method and system applied to model training
CN116112694B (en) * 2022-12-09 2023-12-15 无锡天宸嘉航科技有限公司 Video data coding method and system applied to model training

Similar Documents

Publication Publication Date Title
CN111711824B (en) Loop filtering method, device and equipment in video coding and decoding and storage medium
CN111819854B (en) Method and apparatus for coordinating multi-sign bit concealment and residual sign prediction
CN111819852B (en) Method and apparatus for residual symbol prediction in the transform domain
CN106170092B (en) Fast coding method for lossless coding
US9414086B2 (en) Partial frame utilization in video codecs
KR101808327B1 (en) Video encoding/decoding method and apparatus using paddding in video codec
CN113766249B (en) Loop filtering method, device, equipment and storage medium in video coding and decoding
KR101482896B1 (en) Optimized deblocking filters
US20130182776A1 (en) Video Encoding Using Block-Based Mixed-Resolution Data Pruning
CN111837389A (en) Block detection method and device suitable for multi-sign bit hiding
CN112738511B (en) Fast mode decision method and device combined with video analysis
CN102577377A (en) Apparatus and method for deblocking filtering image data and video decoding apparatus and method using the same
CN111741299B (en) Method, device and equipment for selecting intra-frame prediction mode and storage medium
CN112913236B (en) Encoder, decoder and corresponding methods using compressed MV storage
CN113784126A (en) Image encoding method, apparatus, device and storage medium
CN112001854A (en) Method for repairing coded image and related system and device
CN113822824B (en) Video deblurring method, device, equipment and storage medium
CN116916036A (en) Video compression method, device and system
Wang et al. A low complexity compressed sensing-based codec for consumer depth video sensors
WO2012118569A1 (en) Visually optimized quantization
CN111971961A (en) Image processing apparatus and method for performing efficient deblocking
CN111212288B (en) Video data encoding and decoding method and device, computer equipment and storage medium
CN116982262A (en) State transition for dependent quantization in video coding
CN117136540A (en) Residual coding method and device, video coding method and device, and storage medium
WO2023019407A1 (en) Inter-frame prediction method, coder, decoder, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination