CN115396669A - Video compression method and device based on interest area enhancement - Google Patents

Video compression method and device based on interest area enhancement Download PDF

Info

Publication number
CN115396669A
CN115396669A (application CN202211006575.5A)
Authority
CN
China
Prior art keywords
image
component
interest
region
enhancement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211006575.5A
Other languages
Chinese (zh)
Inventor
滕芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science filed Critical Shanghai University of Engineering Science
Priority to CN202211006575.5A priority Critical patent/CN115396669A/en
Publication of CN115396669A publication Critical patent/CN115396669A/en
Pending legal-status Critical Current

Classifications

    All classifications fall under H ELECTRICITY → H04 ELECTRIC COMMUNICATION TECHNIQUE → H04N PICTORIAL COMMUNICATION, e.g. TELEVISION → H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
    • H04N19/167: adaptive coding characterised by the position within a video image, e.g. region of interest [ROI]
    • H04N19/139: adaptive coding characterised by incoming video signal properties; analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/186: adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/34: hierarchical techniques; scalability involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
    • H04N19/36: hierarchical techniques; scalability involving formatting the layers as a function of picture distortion after decoding, e.g. signal-to-noise [SNR] scalability
    • H04N19/91: entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a video compression method and device based on region-of-interest enhancement. The method comprises the following steps: transforming and quantizing video frames to remove spatially redundant information; extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and then enhancing the region of interest; after image enhancement, compressing the luminance component of the data; taking part of the image data with the enhanced region of interest as training data, and using it to train a generative adversarial network; and decoding the compressed luminance component, feeding the decoded image into the generative adversarial network, and colorizing it to obtain the decoded enhanced image. The invention detects the region of interest with YOLOv4, performs image enhancement in HSV space, and compresses only the image luminance component, improving video compression efficiency.

Description

Video compression method and device based on interest area enhancement
Technical Field
The invention relates to the technical field of video compression, and in particular to a video compression method and device based on region-of-interest enhancement.
Background
With the wide application of high-definition video in multimedia communication equipment, high-efficiency video coding has attracted wide attention at home and abroad. In practical applications, the demand for high-definition, high-contrast image processing keeps growing, yet image quality is difficult to guarantee under the limitations of illumination conditions, exposure, transmission bandwidth and storage capacity. Requirements on image quality and detail continue to rise, but detail distortion remains serious because of constraints in the shooting and transmission environments. A compression technique coupled with image enhancement is therefore urgently needed to break through the current technical bottleneck of image compression.
Image enhancement plays an important role in the wide application of images. Its main purpose is to suppress useless noise introduced during acquisition and transmission, highlight the useful information in the image, make the image conform as closely as possible to human visual perception, or convert it into a form better suited to computer recognition and analysis, thereby improving the image's subsequent processing capability and application value.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a video compression method and apparatus based on region-of-interest enhancement, which can improve video compression efficiency and enhance image quality.
In order to solve the problems, the technical scheme of the invention is as follows:
A video compression method based on region-of-interest enhancement comprises the following steps:
transforming and quantizing the video frame to remove spatially redundant information;
extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and then enhancing the region of interest;
after image enhancement, compressing the luminance component of the data;
taking part of the image data with the enhanced region of interest as training data, and using it to train a generative adversarial network;
and decoding the compressed luminance component, feeding the decoded image into the generative adversarial network, and colorizing it to obtain the decoded enhanced image.
Optionally, the step of extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and then enhancing the region of interest specifically comprises the following steps:
detecting people and vehicles in the image with YOLOv4, and taking the detected objects, such as people and vehicles, as regions of interest;
converting the RGB color space image of the region of interest into an HSV space image;
separating the illumination component and the reflectance component of the V component by taking logarithms;
enhancing the V component;
adaptively adjusting the S component;
after image enhancement, converting the enhanced H, S, V components back to R, G, B components.
Optionally, the RGB color space image of the region of interest is converted into an HSV space image using the formulas:

$$V=T_{\max},\qquad S=\begin{cases}0,&T_{\max}=0\\ \dfrac{T_{\max}-T_{\min}}{T_{\max}},&T_{\max}\neq 0\end{cases}$$

$$H=\begin{cases}0,&T_{\max}=T_{\min}\\ \left(60\times\dfrac{G-B}{T_{\max}-T_{\min}}\right)\bmod 360,&T_{\max}=R\\ 60\times\dfrac{B-R}{T_{\max}-T_{\min}}+120,&T_{\max}=G\\ 60\times\dfrac{R-G}{T_{\max}-T_{\min}}+240,&T_{\max}=B\end{cases}$$

where R, G, B are respectively the R, G, B components of the image, H, S, V are respectively the components of the image in HSV space, H ∈ [0,360], S ∈ [0,1], V ∈ [0,1], T_max is the maximum of R, G, B, and T_min is the minimum of R, G, B.
Optionally, the illumination component and the reflectance component of the V component are separated by taking logarithms; the adopted formula is:
V=L×R
where V is the V component of the image in HSV space, L is the ambient illumination image data, and R is the reflectance image data; taking logarithms turns the product into a sum, log V = log L + log R, so the two components can be separated.
Optionally, in the step of enhancing the V component, the enhanced V component can be expressed as V′:

$$V'(x,y)=\sum_{n=1}^{N}\omega_n\left[\log V(x,y)-\log\left(F_n(x,y)*V(x,y)\right)\right]$$

where ω_n is the weighting coefficient of the n-th scale, N is the number of scales, and F_n is the Gaussian surround function of the n-th scale, with * denoting convolution.
Optionally, in the step of adaptively adjusting the S component, the adjusted S component may be represented as S':
S′=S+t×(V′-V×λ)
in the formula, t is a proportionality constant, and λ is an adaptive coefficient.
Optionally, in the step of compressing the luminance component in the data after the image enhancement, the formula for compressing the luminance component of the enhanced image is as follows:
Y′=C(Y)
where C () is an image encoder, Y is an image luminance signal, and Y' is a compressed luminance signal.
Optionally, the step of taking part of the image data with the enhanced region of interest as training data and using the training data to train a generative adversarial network specifically comprises the following steps:
constructing a generative adversarial network for the chrominance components to colorize the picture;
designing the generator loss function.
Optionally, the generator loss function is:

$$L_{mixed}=a_1 L_a + a_2 L_{MSE} + a_3 L_{content} + a_4 L_{color}$$

where a_1, a_2, a_3, a_4 are the loss-term weights and L_a is the adversarial loss term:

$$L_a=-\log D(G(Y))$$

where log(·) is the logarithm, D(·) is the image discriminator model, and L_{MSE} is the mean-square-error loss term:

$$L_{MSE}=\left\|G(Y)-X\right\|_2^2$$

where ‖·‖₂ is the 2-norm, X is the target color image, and L_{content} is the feature loss term:

$$L_{content}=\sum_{j}\frac{1}{c_j h_j w_j}\left\|\phi_j(G(Y))-\phi_j(X)\right\|_1$$

where ‖·‖₁ is the 1-norm, c_j, h_j, w_j are respectively the number of channels, the height and the width of the j-th feature map, and φ_j(·) is the output of the j-th layer of the network's feature-extraction layer; the color loss term L_{color} (rendered as an image in the original and not recoverable) is defined in terms of the generated image G(Y) and Gaussian filters G_0 and G_t.
Further, the present invention also provides a video compression apparatus based on region-of-interest enhancement. The apparatus comprises a processor and a memory; the processor reads the executable program code stored in the memory and runs the corresponding program, so as to implement the video compression method based on region-of-interest enhancement described above.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention detects the region of interest with the YOLOv4 algorithm, which is fast enough to process video data in real time;
2. the RGB components are converted into HSV components before enhancing the region of interest; the mean, variance and entropy of the enhanced image are superior to those of an image enhanced directly in RGB space;
3. the invention compresses only the image luminance component and designs a generative adversarial network at the decoding end to colorize the image, improving video compression efficiency;
4. the mixed loss function of the generative adversarial network constructed by the invention improves the quality of the enhanced image during video colorization.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
fig. 1 is a flow chart of a video compression method based on region of interest enhancement according to an embodiment of the present invention;
FIG. 2 is a structure diagram of the generative adversarial network according to an embodiment of the invention;
fig. 3 is a block diagram of a video compression apparatus based on region of interest enhancement according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will help those skilled in the art to further understand the present invention, and will make the technical solutions of the present invention and their advantages obvious.
The invention provides a video compression method based on region-of-interest enhancement: a region of interest in each video frame is extracted with the YOLOv4 algorithm, the frame is enhanced and compressed, and the decoded image is colorized by a generative adversarial network. Specifically, fig. 1 is a flow chart of the video compression method based on region-of-interest enhancement provided by an embodiment of the invention; as shown in fig. 1, the method comprises the following steps:
S1: transforming and quantizing the video frame to remove spatially redundant information;
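As a minimal illustration of step S1 (the patent does not name a specific transform; an 8×8 block DCT with uniform quantization is assumed here as a common choice):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix for n-point transforms."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def transform_quantize(block, q=16):
    """DCT-transform an 8x8 pixel block and uniformly quantize the coefficients."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T          # separable 2-D DCT
    return np.round(coeffs / q).astype(np.int32)

def dequantize_inverse(qcoeffs, q=16):
    """Invert the quantization and the DCT to reconstruct the block."""
    d = dct_matrix(qcoeffs.shape[0])
    return d.T @ (qcoeffs * float(q)) @ d

block = np.arange(64, dtype=np.float64).reshape(8, 8)
qc = transform_quantize(block)       # small integers: spatial redundancy removed
rec = dequantize_inverse(qc)         # approximate reconstruction
```

Quantization discards low-energy coefficients, which is where the spatial-redundancy reduction comes from; the step size `q` trades rate for distortion.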
S2: extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and then enhancing the region of interest;
specifically, the step S2 includes the steps of:
step 21: detecting people and vehicles in the image with YOLOv4, and taking the detected objects, such as people and vehicles, as regions of interest;
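A sketch of the ROI-selection logic of step 21, assuming a detector that returns (class, confidence, box) tuples; the class list, threshold and tuple layout are illustrative, not taken from the patent:

```python
# Hypothetical detection tuples: (class_name, confidence, (x, y, w, h)).
ROI_CLASSES = {"person", "car", "bus", "truck", "motorbike", "bicycle"}

def select_rois(detections, conf_threshold=0.5):
    """Keep only person/vehicle detections above a confidence threshold
    and return their bounding boxes as regions of interest."""
    return [box for cls, conf, box in detections
            if cls in ROI_CLASSES and conf >= conf_threshold]

detections = [
    ("person", 0.91, (10, 20, 40, 80)),
    ("dog",    0.88, (60, 30, 30, 30)),   # not a person/vehicle class
    ("car",    0.45, (5, 5, 50, 25)),     # below the confidence threshold
]
rois = select_rois(detections)  # → [(10, 20, 40, 80)]
```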
step 22: converting the RGB color space image of the region of interest into an HSV space image;
the specific formulas are:

$$V=T_{\max},\qquad S=\begin{cases}0,&T_{\max}=0\\ \dfrac{T_{\max}-T_{\min}}{T_{\max}},&T_{\max}\neq 0\end{cases} \quad (1)$$

$$H=\begin{cases}0,&T_{\max}=T_{\min}\\ \left(60\times\dfrac{G-B}{T_{\max}-T_{\min}}\right)\bmod 360,&T_{\max}=R\\ 60\times\dfrac{B-R}{T_{\max}-T_{\min}}+120,&T_{\max}=G\\ 60\times\dfrac{R-G}{T_{\max}-T_{\min}}+240,&T_{\max}=B\end{cases}$$

where R, G, B are respectively the R, G, B components of the image, H, S, V are respectively the components of the image in HSV space, H ∈ [0,360], S ∈ [0,1], V ∈ [0,1], T_max is the maximum of R, G, B, and T_min is the minimum of R, G, B;
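The RGB-to-HSV conversion of step 22 can be sketched per pixel as follows (`colorsys` from the standard library is used only as a cross-check; it reports H scaled to [0, 1]):

```python
import colorsys

def rgb_to_hsv(r, g, b):
    """Convert normalized R, G, B in [0, 1] to (H, S, V) with
    H in [0, 360) and S, V in [0, 1], using Tmax/Tmin as in the text."""
    tmax, tmin = max(r, g, b), min(r, g, b)
    v = tmax
    s = 0.0 if tmax == 0 else (tmax - tmin) / tmax
    if tmax == tmin:                       # achromatic: hue undefined, use 0
        h = 0.0
    elif tmax == r:
        h = (60.0 * (g - b) / (tmax - tmin)) % 360.0
    elif tmax == g:
        h = 60.0 * (b - r) / (tmax - tmin) + 120.0
    else:                                  # tmax == b
        h = 60.0 * (r - g) / (tmax - tmin) + 240.0
    return h, s, v

# Cross-check against the standard library on one color:
h, s, v = rgb_to_hsv(1.0, 0.5, 0.25)
ch, cs, cv = colorsys.rgb_to_hsv(1.0, 0.5, 0.25)
```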
step 23: separating the illumination component and the reflectance component of the V component by taking logarithms;
V=L×R (2)
where V is the V component of the image in HSV space, L is the ambient illumination image data, and R is the reflectance image data; taking logarithms gives log V = log L + log R, separating the two components;
step 24: enhancing the V component; the enhanced V component can be expressed as V′:

$$V'(x,y)=\sum_{n=1}^{N}\omega_n\left[\log V(x,y)-\log\left(F_n(x,y)*V(x,y)\right)\right] \quad (3)$$

where ω_n is the weighting coefficient of the n-th scale, N is the number of scales, and F_n is the Gaussian surround function of the n-th scale, with * denoting convolution;
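A sketch of multi-scale Retinex enhancement of the V channel, under the assumption that V′ follows the standard MSR form of a weighted sum of log(V) minus the log of a Gaussian-blurred V; the scales, weights and final rescaling are illustrative choices, not prescribed by the patent:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Normalized 1-D Gaussian kernel."""
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with edge padding (the surround F_n * V)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    p = np.pad(img, r, mode="edge")
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode="valid"), 0, tmp)

def msr_enhance(v, sigmas=(15, 80, 250), weights=None):
    """Multi-scale Retinex on V: sum_n w_n * (log V - log(F_n * V)),
    then rescaled to [0, 1] for display."""
    weights = weights or [1.0 / len(sigmas)] * len(sigmas)
    v = np.clip(v, 1e-6, 1.0)
    out = np.zeros_like(v)
    for w, s in zip(weights, sigmas):
        out += w * (np.log(v) - np.log(np.clip(blur(v, s), 1e-6, None)))
    return (out - out.min()) / (out.max() - out.min() + 1e-12)
```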
step 25: the S component is adaptively adjusted, and the adjusted S component can be represented as S':
S′=S+t×(V′-V×λ) (4)
in the formula, t is a proportionality constant and λ is an adaptive coefficient;
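The adaptive S adjustment of step 25 in code; the values of t and λ are illustrative, and the clip back to [0, 1] is an added safeguard not stated in the patent:

```python
import numpy as np

def adjust_saturation(s, v, v_enhanced, t=0.3, lam=1.0):
    """Adaptive S adjustment: S' = S + t * (V' - V * lambda),
    clipped to the valid saturation range [0, 1]."""
    return np.clip(s + t * (v_enhanced - v * lam), 0.0, 1.0)
```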
step 26: after the image is enhanced, the enhanced H, S, V components are reconverted to R, G, B components.
S3: after image enhancement, compressing the luminance component of the data;
specifically, the luminance component of the enhanced image is compressed:
Y′=C(Y) (5)
where C(·) is the image encoder, Y is the image luminance signal, and Y′ is the compressed luminance signal.
S4: taking part of the image data with the enhanced region of interest as training data, and using it to train a generative adversarial network;
specifically, the structure of the generative adversarial network is shown in fig. 2, and step S4 comprises the following steps:
step 41: constructing a generative adversarial network for the chrominance components to colorize the picture;
the generator consists of a multi-scale feature extractor, residual connections based on an attention mechanism, and a regularized feature-reconstruction mechanism; the discriminator adopts a PatchGAN structure.
step 42: designing the generator loss function;
the specific formulas are:

$$L_{mixed}=a_1 L_a + a_2 L_{MSE} + a_3 L_{content} + a_4 L_{color} \quad (6)$$

where a_1, a_2, a_3, a_4 are the loss-term weights and L_a is the adversarial loss term:

$$L_a=-\log D(G(Y)) \quad (7)$$

where log(·) is the logarithm, D(·) is the image discriminator model, and L_{MSE} is the mean-square-error loss term:

$$L_{MSE}=\left\|G(Y)-X\right\|_2^2 \quad (8)$$

where ‖·‖₂ is the 2-norm, X is the target color image, and L_{content} is the feature loss term:

$$L_{content}=\sum_{j}\frac{1}{c_j h_j w_j}\left\|\phi_j(G(Y))-\phi_j(X)\right\|_1 \quad (9)$$

where ‖·‖₁ is the 1-norm, c_j, h_j, w_j are respectively the number of channels, the height and the width of the j-th feature map, and φ_j(·) is the output of the j-th layer of the network's feature-extraction layer; the color loss term L_{color} (equation (10), rendered as an image in the original and not recoverable) is defined in terms of the generated image G(Y) and Gaussian filters G_0 and G_t.
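A numpy sketch of the mixed generator loss with stand-in inputs; the color term is omitted here because its exact form is not recoverable from the source, and `feats_*` stand for per-layer feature maps of shape (c_j, h_j, w_j):

```python
import numpy as np

def mixed_generator_loss(d_fake, g_out, target, feats_fake, feats_real,
                         a=(1.0, 1.0, 1.0)):
    """Sketch of L_mixed = a1*La + a2*LMSE + a3*Lcontent (color term omitted).
    d_fake is the discriminator score D(G(Y)); feats_* are lists of feature maps."""
    l_a = -np.log(np.clip(d_fake, 1e-12, 1.0))          # adversarial term (7)
    l_mse = np.mean((g_out - target) ** 2)              # mean-square error (8)
    l_content = sum(np.abs(ff - fr).sum() / ff.size     # 1-norm over c*h*w (9)
                    for ff, fr in zip(feats_fake, feats_real))
    return a[0] * l_a + a[1] * l_mse + a[2] * l_content
```

With a perfect generator (output equal to the target, matching features), the loss reduces to the adversarial term alone.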
S5: decoding the compressed luminance component, feeding the decoded image into the generative adversarial network, and colorizing the image to obtain the decoded enhanced image.
As shown in fig. 3, the invention further provides a video compression apparatus based on region-of-interest enhancement. The apparatus comprises a processor 31 and a memory 32; the processor 31 reads the executable program code stored in the memory 32 and runs the corresponding program, so as to implement the video compression method based on region-of-interest enhancement according to the foregoing embodiment.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention detects the region of interest with the YOLOv4 algorithm, which is fast enough to process video data in real time;
2. the RGB components are converted into HSV components before enhancing the region of interest; the mean, variance and entropy of the enhanced image are superior to those of an image enhanced directly in RGB space;
3. the invention compresses only the image luminance component and designs a generative adversarial network at the decoding end to colorize the image, improving video compression efficiency;
4. the mixed loss function of the generative adversarial network constructed by the invention improves the quality of the enhanced image during video colorization.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. Embodiments and features of embodiments of the present application may be combined with one another arbitrarily in the absence of conflict.

Claims (10)

1. A video compression method based on region-of-interest enhancement, the method comprising the steps of:
transforming and quantizing the video frame to remove spatially redundant information;
extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and then enhancing the region of interest;
after image enhancement, compressing the luminance component of the data;
taking part of the image data with the enhanced region of interest as training data, and using the training data to train a generative adversarial network;
and decoding the compressed luminance component, feeding the decoded image into the generative adversarial network, and colorizing the image to obtain a decoded enhanced image.
2. The method of claim 1, wherein the step of extracting the image region of interest with YOLOv4, converting the RGB color space components into HSV space components, and enhancing the region of interest comprises the following steps:
detecting people and vehicles in the image with YOLOv4, and taking the detected objects, such as people and vehicles, as regions of interest;
converting the RGB color space image of the region of interest into an HSV space image;
separating the illumination component and the reflectance component of the V component by taking logarithms;
enhancing the V component;
adaptively adjusting the S component;
after image enhancement, converting the enhanced H, S, V components back to R, G, B components.
3. The method according to claim 2, wherein the RGB color space image of the region of interest is converted into an HSV space image by the following formulas:

$$V=T_{\max},\qquad S=\begin{cases}0,&T_{\max}=0\\ \dfrac{T_{\max}-T_{\min}}{T_{\max}},&T_{\max}\neq 0\end{cases}$$

$$H=\begin{cases}0,&T_{\max}=T_{\min}\\ \left(60\times\dfrac{G-B}{T_{\max}-T_{\min}}\right)\bmod 360,&T_{\max}=R\\ 60\times\dfrac{B-R}{T_{\max}-T_{\min}}+120,&T_{\max}=G\\ 60\times\dfrac{R-G}{T_{\max}-T_{\min}}+240,&T_{\max}=B\end{cases}$$

wherein R, G, B are respectively the R, G, B components of the image, H, S, V are respectively the components of the image in HSV space, H ∈ [0,360], S ∈ [0,1], V ∈ [0,1], T_max is the maximum of R, G, B, and T_min is the minimum of R, G, B.
4. The region-of-interest-enhancement-based video compression method according to claim 2, wherein the illumination component and the reflectance component of the V component are separated by taking logarithms, using the formula:
V=L×R
where V is the V component of the image in HSV space, L is the ambient illumination image data, and R is the reflectance image data.
5. The method of claim 2, wherein in the step of enhancing the V component, the enhanced V component is represented as V′:

$$V'(x,y)=\sum_{n=1}^{N}\omega_n\left[\log V(x,y)-\log\left(F_n(x,y)*V(x,y)\right)\right]$$

where ω_n is the weighting coefficient of the n-th scale, N is the number of scales, and F_n is the Gaussian surround function of the n-th scale, with * denoting convolution.
6. The method of claim 2, wherein in the step of adaptively adjusting the S component, the adjusted S component can be represented as S':
S′=S+t×(V′-V×λ)
in the formula, t is a proportionality constant, and λ is an adaptive coefficient.
7. The method of claim 1, wherein in the step of compressing the luminance component of the data after image enhancement, the formula for compressing the luminance component of the enhanced image is:
Y′=C(Y)
where C(·) is the image encoder, Y is the image luminance signal, and Y′ is the compressed luminance signal.
8. The method according to claim 1, wherein the step of taking part of the image data with the enhanced region of interest as training data and using the training data to train a generative adversarial network comprises the steps of:
constructing a generative adversarial network for the chrominance components to colorize the picture;
designing the generator loss function.
9. The method of claim 8, wherein the generator loss function is:

$$L_{mixed}=a_1 L_a + a_2 L_{MSE} + a_3 L_{content} + a_4 L_{color}$$

where a_1, a_2, a_3, a_4 are the loss-term weights and L_a is the adversarial loss term:

$$L_a=-\log D(G(Y))$$

where log(·) is the logarithm, D(·) is the image discriminator model, and L_{MSE} is the mean-square-error loss term:

$$L_{MSE}=\left\|G(Y)-X\right\|_2^2$$

where ‖·‖₂ is the 2-norm, X is the target color image, and L_{content} is the feature loss term:

$$L_{content}=\sum_{j}\frac{1}{c_j h_j w_j}\left\|\phi_j(G(Y))-\phi_j(X)\right\|_1$$

where ‖·‖₁ is the 1-norm, c_j, h_j, w_j are respectively the number of channels, the height and the width of the j-th feature map, and φ_j(·) is the output of the j-th layer of the network's feature-extraction layer; the color loss term L_{color} (rendered as an image in the original and not recoverable) is defined in terms of the generated image G(Y) and Gaussian filters G_0 and G_t.
10. An apparatus for video compression based on region-of-interest enhancement, the apparatus comprising a processor and a memory, the processor reading the executable program code stored in the memory and running the corresponding program, so as to implement the method for video compression based on region-of-interest enhancement according to any one of claims 1 to 9.
CN202211006575.5A 2022-08-22 2022-08-22 Video compression method and device based on interest area enhancement Pending CN115396669A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211006575.5A CN115396669A (en) 2022-08-22 2022-08-22 Video compression method and device based on interest area enhancement


Publications (1)

Publication Number Publication Date
CN115396669A true CN115396669A (en) 2022-11-25

Family

ID=84120936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211006575.5A Pending CN115396669A (en) 2022-08-22 2022-08-22 Video compression method and device based on interest area enhancement

Country Status (1)

Country Link
CN (1) CN115396669A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115604477A (en) * 2022-12-14 2023-01-13 广州波视信息科技股份有限公司 Ultrahigh-definition video distortion optimization coding method
CN116258653A (en) * 2023-05-16 2023-06-13 深圳市夜行人科技有限公司 Low-light level image enhancement method and system based on deep learning
CN116258653B (en) * 2023-05-16 2023-07-14 深圳市夜行人科技有限公司 Low-light level image enhancement method and system based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination