WO2023005699A1 - Video enhancement network training method and device, and video enhancement method and device - Google Patents

Video enhancement network training method and device, and video enhancement method and device

Info

Publication number
WO2023005699A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
layer
video frame
network
enhanced
Prior art date
Application number
PCT/CN2022/106156
Other languages
English (en)
Chinese (zh)
Inventor
崔同兵
黄志杰
Original Assignee
广州安思创信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州安思创信息技术有限公司
Publication of WO2023005699A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 Image coding
    • G06T 9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Definitions

  • the embodiments of the present application relate to the technical field of video processing, for example, to a video enhancement network training method, a video enhancement method and a device.
  • video compression/encoding can reduce storage space and transmission bandwidth, and therefore plays a vital role in video storage and transmission.
  • Video compression introduces various distortions, such as blocking artifacts and blur, into the compressed video, which seriously affects the viewing experience.
  • neural networks are widely used in video quality improvement.
  • more complex and deeper networks are often used to extract image features, but complex and deep neural networks run slowly, while video enhancement tasks place high demands on network speed.
  • slow neural networks limit the application of image enhancement networks to video quality enhancement tasks.
  • the neural network used for video enhancement in the related art cannot balance the video enhancement quality and running speed.
  • the embodiments of the present application provide a video enhancement network training method, a video enhancement method, a device, an electronic device, and a storage medium, so as to avoid the situation in the related art where the neural network used for video enhancement cannot balance video enhancement quality and running speed.
  • the embodiment of the present application provides a video enhancement network training method, including: acquiring a first video frame and a second video frame used for training, the second video frame being a video frame obtained after enhancement processing of the first video frame; constructing a video enhancement network; and training the video enhancement network using the first video frame and the second video frame;
  • the video enhancement network includes an input layer, an output layer, and a plurality of dense residual subnetworks between the input layer and the output layer, and each of the dense residual subnetworks includes a downsampling layer, an upsampling layer, and a plurality of convolutional layers located between the downsampling layer and the upsampling layer; the input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • the embodiment of the present application provides a video enhancement method, including: acquiring video data to be enhanced, the video data to be enhanced including multiple video frames; inputting the video frames into a pre-trained video enhancement network to obtain enhanced video frames; and splicing the enhanced video frames into enhanced video data;
  • the video enhancement network is trained by the video enhancement network training method described in the first aspect.
  • the embodiment of the present application provides a video enhancement network training device, including:
  • the training data acquisition module is configured to acquire the first video frame and the second video frame used for training, and the second video frame is a video frame obtained after enhancement processing of the first video frame;
  • a network construction module configured to construct a video enhancement network;
  • a network training module configured to train the video enhancement network using the first video frame and the second video frame
  • the video enhancement network includes an input layer, an output layer, and a plurality of dense residual subnetworks between the input layer and the output layer, and each of the dense residual subnetworks includes a downsampling layer, an upsampling layer, and a plurality of convolutional layers located between the downsampling layer and the upsampling layer; the input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • the embodiment of the present application provides a video enhancement device, including:
  • a to-be-enhanced video data acquisition module configured to acquire video data to be enhanced, where the video data to be enhanced includes multiple video frames;
  • a video enhancement module configured to input the video frames into a pre-trained video enhancement network to obtain enhanced video frames;
  • a splicing module configured to splice the enhanced video frames into enhanced video data
  • the video enhancement network is trained by the video enhancement network training method described in the first aspect.
  • an embodiment of the present application provides an electronic device, the electronic device comprising:
  • one or more processors;
  • storage means configured to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the video enhancement network training method described in the first aspect of the present application, and/or the video enhancement method described in the second aspect.
  • the embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the video enhancement network training method described in the first aspect of the present application, and/or the video enhancement method described in the second aspect, is implemented.
  • Fig. 1 is a flow chart of the steps of a video enhancement network training method provided by an embodiment of the present application
  • FIG. 2A is a flow chart of the steps of a video enhancement network training method provided by another embodiment of the present application.
  • Fig. 2B is a schematic diagram of the dense residual subnetwork in the embodiment of the present application.
  • FIG. 2C is a schematic structural diagram of a video enhancement network according to an embodiment of the present application.
  • Fig. 3 is a flow chart of steps of a video enhancement method provided by an embodiment of the present application.
  • Fig. 4 is a structural block diagram of a video enhancement network training device provided by an embodiment of the present application.
  • Fig. 5 is a structural block diagram of a video enhancement device provided by an embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 1 is a flow chart of the steps of a video enhancement network training method provided by an embodiment of the present application.
  • the embodiment of the present application is applicable to the situation where the video enhancement network is trained to enhance the video.
  • the method can be performed by the video enhancement network training device of the embodiment of the present application; the video enhancement network training device can be implemented in hardware or software and integrated into the electronic device provided by the embodiment of the present application. As shown in Figure 1, the video enhancement network training method of the embodiment of the present application may include the following steps:
  • the first video frame can be the video frame used as the input to the video enhancement network during training;
  • the second video frame can be the video frame used as the label during training; that is, the second video frame can be the video frame resulting from enhancement processing of the first video frame.
  • video data is composed of multiple video frames; the video data is encoded and compressed at the sending end before network transmission and decoded when the receiving end receives it. Because the video data is encoded and decoded, the decoded video data is distorted to a certain extent, so multiple video frames can be extracted from the decoded video data as the first video frames for training, and the corresponding undistorted video frames from the video data before encoding and compression can be used as the second video frames. Of course, enhanced video frames obtained by manually enhancing the first video frames may also be used as the second video frames.
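  • For illustration only, the following is a minimal PyTorch sketch of how such training pairs might be assembled; the directory layout, file names, and use of torchvision are assumptions, not part of the embodiment:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision.transforms.functional import to_tensor

class FramePairDataset(Dataset):
    """Pairs each decoded (distorted) frame with its pre-compression original."""

    def __init__(self, compressed_dir, original_dir):
        # Assumes both directories contain frames exported with matching
        # file names, e.g. frame_00001.png in each.
        self.names = sorted(os.listdir(compressed_dir))
        self.compressed_dir = compressed_dir
        self.original_dir = original_dir

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        first = to_tensor(Image.open(os.path.join(self.compressed_dir, name)))   # network input
        second = to_tensor(Image.open(os.path.join(self.original_dir, name)))    # training label
        return first, second
```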
  • the video enhancement network of the embodiment of the present application includes an input layer, an output layer, and a plurality of dense residual subnetworks between the input layer and the output layer; each dense residual subnetwork includes a downsampling layer, an upsampling layer, and multiple convolutional layers located between the downsampling layer and the upsampling layer; the input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • the input and output layers may be convolutional layers.
  • Each dense residual sub-network has a downsampling layer, so that all feature operations are performed at the downsampled resolution, which reduces the complexity of the video enhancement network.
  • the input of each convolutional layer in the dense residual sub-network is the sum of the output features of all layers before the convolutional layer, which realizes feature multiplexing, improves the transmission of features when the signal is sparse, avoids feature loss, and improves the recovery quality of video frames.
  • after the first video frame is input to the input layer, it undergoes convolution processing to obtain a shallow feature map.
  • the shallow feature map is input into the first dense residual sub-network and then down-sampled to obtain a down-sampled feature map.
  • the input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • the video enhancement network outputs the enhanced video frame, and the parameters of the video enhancement network are adjusted by calculating the loss rate between the enhanced video frame and the second video frame until the video enhancement network converges or the number of training iterations reaches a preset number, so as to obtain a trained video enhancement network.
  • the trained video enhancement network outputs an enhanced video frame when a video frame to be enhanced is input.
  • the video enhancement network of the embodiment of the present application includes a plurality of dense residual sub-networks, and each dense residual sub-network includes a downsampling layer; all features are extracted at the downsampled resolution, which reduces the complexity of the video enhancement network and improves its running speed. The input feature of each convolutional layer in the dense residual sub-network is the sum of the output features of all layers before the convolutional layer, which realizes feature multiplexing, improves feature transmission when the signal is sparse, and allows high-quality video frames to be recovered. That is, the video enhancement network of the embodiment of the present application balances video enhancement quality and running speed.
  • Fig. 2A is a flow chart of the steps of a video enhancement network training method provided by another embodiment of the present application.
  • the embodiment of the present application is refined on the basis of the foregoing embodiments.
  • the video enhancement network training method may include the following steps:
  • video data is composed of multiple video frames; the video data is encoded and compressed by the sending end before network transmission and decoded when the receiving end receives it. Because the video data is encoded and decoded, the decoded video data is distorted to a certain extent. Multiple video frames can be extracted from the decoded video data as the first video frames for training, and the unencoded, uncompressed video frames from the video data before encoding can be used as the second video frames. Of course, enhanced video frames obtained by manually enhancing the first video frames may also be used as the second video frames.
  • the dense residual sub-network can be a network containing multiple convolutional layers.
  • the input of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • for each dense residual sub-network, multiple sequentially connected convolutional layers are constructed, where the input of each convolutional layer is the sum of the output features of all layers before that convolutional layer;
  • a downsampling layer is connected before the first convolutional layer, and an upsampling layer is connected after the last convolutional layer;
  • a second adder is connected after the upsampling layer;
  • the second adder is used to add the output features of the upsampling layer and the input features of the downsampling layer as the output features of the dense residual sub-network.
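  • As an illustration of this structure, the following is a minimal PyTorch sketch of one dense residual sub-network; the channel width, number of convolutional layers, ReLU activation, and factor-2 sampling are assumptions rather than values fixed by the embodiment:

```python
import torch
import torch.nn as nn

class DenseResidualBlock(nn.Module):
    """Sketch: bilinear downsample, densely connected 3x3 convolutions,
    PixelShuffle upsample, then the residual add (second adder SUM2)."""

    def __init__(self, channels=32, num_convs=4):
        super().__init__()
        # Bilinear interpolation sampling at factor 1/2 (downsampling).
        self.down = nn.Upsample(scale_factor=0.5, mode="bilinear", align_corners=False)
        # Conv i takes the concatenation of F_0 .. F_{i-1}, i.e. channels * (i + 1) inputs.
        self.convs = nn.ModuleList([
            nn.Conv2d(channels * (i + 1), channels, kernel_size=3, padding=1)
            for i in range(num_convs)])
        self.act = nn.ReLU(inplace=True)
        # Expand to 4x channels so PixelShuffle(2) restores full resolution and width.
        self.expand = nn.Conv2d(channels, channels * 4, kernel_size=3, padding=1)
        self.up = nn.PixelShuffle(2)

    def forward(self, f_in):                     # f_in: (N, C, H, W), H and W even
        feats = [self.down(f_in)]                # F_0, at half resolution
        for conv in self.convs:
            feats.append(self.act(conv(torch.cat(feats, dim=1))))
        up = self.up(self.expand(feats[-1]))     # back to (N, C, H, W)
        return up + f_in                         # second adder SUM2
```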
  • the downsampling layer can use bilinear interpolation sampling;
  • the convolution kernel size of each convolutional layer can be 3 × 3, and the operation of the i-th convolutional layer can be expressed as F_i = σ(W_i ∗ [F_0, F_1, ..., F_(i-1)] + b_i), where σ(·) is the activation function, W_i and b_i are the weights and bias coefficients of the convolutional layer, [F_0, F_1, ..., F_(i-1)] is the concatenation of the output features of all layers before the convolutional layer, and F_i is the feature obtained after convolution.
  • a schematic diagram of the dense residual sub-network is shown in Figure 2B.
  • the input feature F_in is passed through the downsampling layer to obtain a downsampled feature map F_0; the downsampled feature map F_0 is passed through the first convolutional layer, which outputs the feature map F_1; the downsampled feature map F_0 and the feature map F_1 are then concatenated as the input feature of the second convolutional layer, which outputs the feature map F_2; the feature maps F_0, F_1, and F_2 are concatenated as the input features of the third convolutional layer, and so on.
  • the splicing of two or more feature maps is the concatenation, along the channel dimension, of feature maps with the same spatial size. For example, if feature map A is H × W × C_A and feature map B is H × W × C_B, the feature map obtained by splicing feature map A and feature map B is H × W × (C_A + C_B), where H is the height of the feature map, W is the width of the feature map, and C is the number of channels.
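  • In PyTorch, for example, this channel-wise splicing is a single torch.cat call (the tensor shapes here are illustrative):

```python
import torch

A = torch.randn(1, 16, 64, 64)   # feature map A: C_A = 16 channels
B = torch.randn(1, 24, 64, 64)   # feature map B: C_B = 24 channels, same H x W
AB = torch.cat([A, B], dim=1)    # spliced map: (1, 40, 64, 64), i.e. C_A + C_B channels
```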
  • the feature map F_d output by the last convolutional layer is up-sampled to obtain an up-sampled feature map with the same size as the input feature F_in; finally, the up-sampled feature map and the input feature map F_in pass through the second adder SUM2 to obtain the output feature F_out of the dense residual sub-network, and the output feature F_out is used as the input feature F_in of the next dense residual sub-network.
  • the second adder is used for adding the pixel values of corresponding pixels in the input feature map F_in and the up-sampled feature map.
  • the upsampling layer performs pixel rearrangement on the output feature map of the last convolutional layer through a preset pixel rearrangement algorithm to obtain an upsampled feature map with the same size as the input feature map of the downsampling layer.
  • the pixel shuffle (PixelShuffle) algorithm converts a low-resolution (Low Resolution) input feature map of size H × W into a high-resolution (High Resolution) feature map of size rH × rW through a sub-pixel operation, where r is the upsampling factor, that is, the magnification from low resolution to high resolution.
  • the upsampling layer uses PixelShuffle to rearrange, through periodic shuffling, a feature map with r² × C channels into a high-resolution feature map with C channels.
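  • For example, PyTorch's built-in nn.PixelShuffle implements exactly this rearrangement (the shapes below are illustrative):

```python
import torch
import torch.nn as nn

r, C = 2, 8                                   # upsampling factor and output channels
low_res = torch.randn(1, r * r * C, 32, 32)   # (N, r^2 * C, H, W)
high_res = nn.PixelShuffle(r)(low_res)        # (N, C, r*H, r*W) = (1, 8, 64, 64)
```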
  • an input layer C_in is connected before the first dense residual sub-network SDRB_1.
  • the input layer C_in may be a convolutional layer with a 3 × 3 convolution kernel, so as to perform a convolution operation on the input image to obtain a shallow feature F_in to be input into the first dense residual sub-network SDRB_1.
  • an output layer C_out is connected after the last dense residual sub-network SDRB_N.
  • the output layer C_out may be a convolutional layer with a 3 × 3 convolution kernel, so as to linearly transform the output features of the last dense residual sub-network SDRB_N to obtain a residual map.
  • a first adder SUM1 is connected after the output layer C_out of the video enhancement network; the inputs of the first adder SUM1 are the residual map output by the output layer C_out and the input image I of the input layer C_in, and the first adder SUM1 adds the residual map output by the output layer C_out to the pixel values of the corresponding pixels in the input image I to output the enhanced video frame O.
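  • Putting these pieces together, a minimal sketch of the overall topology might look as follows (reusing the DenseResidualBlock sketch above; the number of sub-networks and the channel width are assumptions):

```python
import torch.nn as nn

class VideoEnhancementNet(nn.Module):
    """Sketch: input conv C_in, dense residual sub-networks SDRB_1..SDRB_N,
    output conv C_out producing a residual map, and the global add SUM1."""

    def __init__(self, channels=32, num_blocks=4):
        super().__init__()
        self.c_in = nn.Conv2d(3, channels, kernel_size=3, padding=1)     # input layer C_in
        self.blocks = nn.Sequential(
            *[DenseResidualBlock(channels) for _ in range(num_blocks)])  # SDRB_1 .. SDRB_N
        self.c_out = nn.Conv2d(channels, 3, kernel_size=3, padding=1)    # output layer C_out

    def forward(self, image):
        shallow = self.c_in(image)      # shallow feature F_in
        deep = self.blocks(shallow)     # through the dense residual sub-networks
        residual = self.c_out(deep)     # residual map
        return image + residual         # first adder SUM1 -> enhanced frame O
```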
  • the number of pixel bits b of the first video frame can be obtained, the maximum pixel value corresponding to the number of pixel bits can be calculated, and the difference between the maximum pixel value and 1 can be computed; for the pixel value of each pixel in the first video frame, the ratio of the pixel value to this difference is calculated as the normalized pixel value of the pixel. For example, the normalization formula is I_norm = I / (2^b − 1), where I is the original pixel value; for 8-bit video, 2^b − 1 = 255.
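  • As a one-line sketch of this normalization (the 8-bit default is an assumption):

```python
import torch

def normalize_frame(frame: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # Normalized pixel value = pixel value / (2^bits - 1), i.e. /255 for 8-bit video.
    return frame.float() / (2 ** bits - 1)
```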
  • the input feature F_in shown in Figure 2B is obtained after the normalized first video frame I is input into the input layer, and the input feature F_in is then transmitted sequentially through the multiple dense residual sub-networks SDRB_1 to SDRB_N.
  • within each dense residual sub-network, the input feature F_in is first sampled by the downsampling layer and then transmitted sequentially through the convolutional layers; the input feature of each convolutional layer is the sum of the output features of all layers before that convolutional layer, and the output of the last convolutional layer passes through the upsampling layer to produce the upsampled feature.
  • the output feature F_out is used as the input feature F_in of the next dense residual sub-network, and the output feature of the last dense residual sub-network SDRB_N is linearly transformed through the output layer C_out to obtain a residual map.
  • the first adder SUM1 adds the residual map output by the output layer C_out to the pixel values of the corresponding pixels in the input image I to output the enhanced video frame O.
  • the loss function is the mean square error loss function, as shown in the following formula: L = (1/N) Σ_(i=1)^N (Y_i − O_i)², where Y is the uncompressed video frame, that is, the second video frame, O is the video frame output by the video enhancement network, and N is the number of pixels;
  • the size of the training video patches can be 32, the training can use the Adam optimizer, and the initial learning rate can be set to 10^(-4).
  • those skilled in the art can also use other loss functions to calculate the loss rate, and the embodiment of the present application does not limit the way of calculating the loss rate.
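  • For illustration, a minimal training loop following the setup described above might look as follows; the batch size, epoch count, and the FramePairDataset and VideoEnhancementNet sketches from earlier are assumptions:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

dataset = FramePairDataset("frames/compressed", "frames/original")
loader = DataLoader(dataset, batch_size=16, shuffle=True)
model = VideoEnhancementNet()
criterion = nn.MSELoss()                                    # mean square error loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # initial learning rate 10^-4

for epoch in range(100):
    for first, second in loader:            # first: distorted input, second: label Y
        enhanced = model(first)             # network output O
        loss = criterion(enhanced, second)  # MSE between O and Y
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```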
  • the number of iterative training can also be counted, and when the number reaches the preset number, the iterative training of the video enhancement network is stopped to obtain a trained video enhancement network.
  • the parameters of the video enhancement network can also be divided into multiple sections, so that the parameters of each section are trained and adjusted separately, and the trained parameters are inherited by the untrained parameters, to improve the training speed.
  • the video enhancement network of the embodiment of the present application includes a plurality of dense residual sub-networks, and each dense residual sub-network includes a downsampling layer; all features are extracted at the downsampled resolution, which reduces the complexity of the video enhancement network and improves its running speed. The input feature of each convolutional layer in the dense residual sub-network is the sum of the output features of all layers before the convolutional layer, which realizes feature multiplexing, improves feature transmission when the signal is sparse, and allows high-quality video frames to be recovered. That is, the video enhancement network of the embodiment of the present application balances video enhancement quality and running speed.
  • Fig. 3 is a flow chart of the steps of a video enhancement method provided by the embodiment of the present application.
  • the embodiment of the present application is applicable to the case of enhancing decompressed video data, and the method can be executed by the video enhancement device of the embodiment of the present application.
  • the video enhancement device may be implemented by hardware or software, and integrated into the electronic device provided by the embodiment of the present application.
  • the video enhancement method of the embodiment of the present application may include the following steps:
  • the video data to be enhanced is composed of multiple video frames
  • the video enhancement may be to perform image processing on the video frames in the video data.
  • the video enhancement may be image processing including defogging, contrast enhancement, lossless magnification, stretch recovery, etc., capable of realizing high-definition video reconstruction.
  • the video data obtained by decoding before the video data is played exhibits distortions such as blocking artifacts and blurring, so the decoded video data needs to be enhanced; the compressed video data can therefore be decoded to obtain the video data to be enhanced.
  • the video data to be enhanced can also be other video data.
  • for example, in a live-streaming scenario, the video data recorded by the camera can be used as the video data to be enhanced, so as to improve video whose quality is poor due to lighting, equipment, and other factors; the embodiment of the present application does not limit the manner of acquiring the video data to be enhanced.
  • the embodiment of the present application can pre-train the video enhancement network. After inputting a video frame, the video enhancement network can output the enhanced video frame.
  • the video enhancement network training method provided in the foregoing embodiments can be used to train the video enhancement network; for the specific training process, reference may be made to the foregoing embodiments, and details are not repeated here.
  • the enhanced video frames can be spliced into enhanced video data according to the playing sequence of the video frames in the video data.
  • the playback time stamp of each video frame in the video data may be recorded, and each enhanced video frame may be spliced according to the playback time stamp to obtain enhanced video data.
  • the embodiment of the present application can also embed the video enhancement network between the decoder and the player: each time the decoder decodes a frame of video, it inputs the frame into the video enhancement network, and the video enhancement network outputs the enhanced video frame to the player for real-time playback, without splicing the enhanced video frames.
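  • A sketch of this real-time pipeline, with hypothetical decoder and player interfaces standing in for the real components:

```python
import torch

@torch.no_grad()
def play_enhanced(decoder, player, model):
    """Decode -> enhance -> play, one frame at a time, with no splicing step.
    `decoder` is assumed to yield frames as (C, H, W) tensors and
    `player.show` to display one frame; both are illustrative placeholders."""
    model.eval()
    for frame in decoder:
        enhanced = model(frame.unsqueeze(0))  # add batch dim, enhance the single frame
        player.show(enhanced.squeeze(0))      # hand the enhanced frame to the player
```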
  • video data to be enhanced is obtained, video frames of the video data are input into a pre-trained video enhancement network to obtain enhanced video frames, and the enhanced video frames are spliced into enhanced video data.
  • the video enhancement network used for video enhancement includes multiple dense residual subnetworks, each of which includes a downsampling layer; all features are extracted at the downsampled resolution, which reduces the complexity of the video enhancement network and improves its running speed. The input feature of each convolutional layer in the dense residual sub-network is the sum of the output features of all layers before the convolutional layer, which realizes feature multiplexing, improves feature transmission when the signal is sparse, and allows high-quality video frames to be restored. That is, the video enhancement network of the embodiment of the present application balances video enhancement quality and running speed.
  • Fig. 4 is a structural block diagram of a video enhancement network training device provided by the embodiment of the present application. As shown in Fig. 4, the video enhancement network training device of the embodiment of the present application includes:
  • the training data acquisition module 401 is configured to obtain the first video frame and the second video frame used for training, and the second video frame is a video frame obtained after enhancement processing of the first video frame;
  • a network construction module 402 configured to construct a video enhancement network
  • a network training module 403, configured to use the first video frame and the second video frame to train the video enhancement network
  • the video enhancement network includes an input layer, an output layer, and a plurality of dense residual subnetworks between the input layer and the output layer, and each of the dense residual subnetworks includes a downsampling layer, an upsampling layer, and a plurality of convolutional layers located between the downsampling layer and the upsampling layer; the input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
  • the video-enhanced network training device provided in the embodiment of the present application can execute the video-enhanced network training method provided in the foregoing embodiments of the present application, and has corresponding functional modules and beneficial effects for executing the method.
  • Fig. 5 is a structural block diagram of a video enhancement device provided in the embodiment of the present application. As shown in Fig. 5, the video enhancement device in the embodiment of the present application may include the following modules:
  • a to-be-enhanced video data acquisition module 501 configured to acquire video data to be enhanced, where the video data to be enhanced includes multiple video frames;
  • a video enhancement module 502 configured to input the video frames into a pre-trained video enhancement network to obtain enhanced video frames;
  • the splicing module 503 is configured to splice the enhanced video frames into enhanced video data
  • the video enhancement network is trained by the video enhancement network training method described in the foregoing embodiments.
  • the video enhancement device provided in the embodiment of the present application can execute the video enhancement method provided in the embodiment of the present application, and has corresponding functional modules and beneficial effects for executing the method.
  • the electronic device may include: a processor 601 , a storage device 602 , a display screen 603 with a touch function, an input device 604 , an output device 605 and a communication device 606 .
  • the number of processors 601 in the electronic device may be one or more, and one processor 601 is taken as an example in FIG. 6 .
  • the processor 601 , storage device 602 , display screen 603 , input device 604 , output device 605 and communication device 606 of the electronic device may be connected via a bus or in other ways. In FIG. 6 , connection via a bus is taken as an example.
  • the electronic device is configured to execute the video enhancement network training method provided in any embodiment of the present application, and/or the video enhancement method.
  • the embodiment of the present application also provides a computer-readable storage medium; when the instructions in the storage medium are executed by the processor of a device, the device can execute the video enhancement network training method described in the above method embodiments, and/or the video enhancement method.
  • the computer readable storage medium may be a non-transitory computer readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the present application relate to a video enhancement network training method and device, and a video enhancement method and device. The video enhancement network training method includes: obtaining a first video frame and a second video frame for training; constructing a video enhancement network; and training the video enhancement network using the first video frame and the second video frame. The video enhancement network includes an input layer, an output layer, and a plurality of dense residual sub-networks located between the input layer and the output layer. Each dense residual sub-network includes a downsampling layer, an upsampling layer, and a plurality of convolutional layers located between the downsampling layer and the upsampling layer. The input feature of each convolutional layer is the sum of the output features of all layers before the convolutional layer.
PCT/CN2022/106156 2021-07-29 2022-07-18 Video enhancement network training method and device, and video enhancement method and device WO2023005699A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110866688.1 2021-07-29
CN202110866688.1A CN113538287B (zh) 2021-07-29 2021-07-29 Video enhancement network training method, video enhancement method, and related device

Publications (1)

Publication Number Publication Date
WO2023005699A1 true WO2023005699A1 (fr) 2023-02-02

Family

ID=78089767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106156 WO2023005699A1 (fr) Video enhancement network training method and device, and video enhancement method and device

Country Status (2)

Country Link
CN (1) CN113538287B (fr)
WO (1) WO2023005699A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117204910A (zh) * 2023-09-26 2023-12-12 北京长木谷医疗科技股份有限公司 Automatic osteotomy method based on deep learning for real-time tracking of knee joint position
CN117590761A (zh) * 2023-12-29 2024-02-23 广东福临门世家智能家居有限公司 Door-opening state detection method and system for smart home

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538287B (zh) 2021-07-29 2024-03-29 广州安思创信息技术有限公司 Video enhancement network training method, video enhancement method, and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724309A (zh) * 2019-03-19 2020-09-29 京东方科技集团股份有限公司 Image processing method and device, neural network training method, and storage medium
CN112288658A (zh) * 2020-11-23 2021-01-29 杭州师范大学 Underwater image enhancement method based on multi-residual joint learning
CN112419219A (zh) * 2020-11-25 2021-02-26 广州虎牙科技有限公司 Image enhancement model training method, image enhancement method, and related device
CN112801904A (zh) * 2021-02-01 2021-05-14 武汉大学 Mixed-degradation image enhancement method based on convolutional neural network
CN113538287A (zh) * 2021-07-29 2021-10-22 广州安思创信息技术有限公司 Video enhancement network training method, video enhancement method, and related device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108235058B (zh) * 2018-01-12 2021-09-17 广州方硅信息技术有限公司 Video quality processing method, storage medium, and terminal
CN109785252B (zh) * 2018-12-25 2023-03-24 山西大学 Night image enhancement method based on multi-scale residual dense network
CN111080575B (zh) * 2019-11-22 2023-08-25 东南大学 Thalamus segmentation method based on residual dense U-shaped network model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724309A (zh) * 2019-03-19 2020-09-29 京东方科技集团股份有限公司 Image processing method and device, neural network training method, and storage medium
CN112288658A (zh) * 2020-11-23 2021-01-29 杭州师范大学 Underwater image enhancement method based on multi-residual joint learning
CN112419219A (zh) * 2020-11-25 2021-02-26 广州虎牙科技有限公司 Image enhancement model training method, image enhancement method, and related device
CN112801904A (zh) * 2021-02-01 2021-05-14 武汉大学 Mixed-degradation image enhancement method based on convolutional neural network
CN113538287A (zh) * 2021-07-29 2021-10-22 广州安思创信息技术有限公司 Video enhancement network training method, video enhancement method, and related device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117204910A (zh) * 2023-09-26 2023-12-12 北京长木谷医疗科技股份有限公司 Automatic osteotomy method based on deep learning for real-time tracking of knee joint position
CN117590761A (zh) * 2023-12-29 2024-02-23 广东福临门世家智能家居有限公司 Door-opening state detection method and system for smart home
CN117590761B (zh) * 2023-12-29 2024-04-19 广东福临门世家智能家居有限公司 Door-opening state detection method and system for smart home

Also Published As

Publication number Publication date
CN113538287A (zh) 2021-10-22
CN113538287B (zh) 2024-03-29

Similar Documents

Publication Publication Date Title
WO2023005699A1 (fr) Video enhancement network training method and device, and video enhancement method and device
CN113205456B (zh) Super-resolution reconstruction method for real-time video session services
WO2017084258A1 (fr) Method for real-time video noise reduction during encoding, terminal, and non-volatile computer-readable storage medium
WO2021254139A1 (fr) Video processing method and device, and storage medium
CN110798690A (zh) Video decoding method, loop filter model training method, apparatus, and device
WO2023246923A1 (fr) Video encoding method, video decoding method, electronic device, and storage medium
CN110751597A (zh) Video super-resolution method based on coding-degradation repair
KR20210018668A (ko) Image processing system and method for performing downsampling using a deep learning neural network, and video streaming server system
KR20190117691A (ko) Method and device for reconstructing an HDR image
CN111696039A (zh) Image processing method and device, storage medium, and electronic device
Ho et al. Down-sampling based video coding with degradation-aware restoration-reconstruction deep neural network
CN113747242B (zh) Image processing method and apparatus, electronic device, and storage medium
WO2023050720A1 (fr) Image processing method, image processing apparatus, and model training method
WO2022266955A1 (fr) Image decoding method and apparatus, image processing method and apparatus, and device
CN116797462A (zh) Real-time video super-resolution reconstruction method based on deep learning
WO2022156688A1 (fr) Layered encoding and decoding methods and apparatuses
CN114240750A (zh) Video resolution enhancement method and apparatus, storage medium, and electronic device
CN115967784A (zh) Image transmission processing system and method based on the MIPI CSI C-PHY protocol
CN115376188B (zh) Video call processing method and system, electronic device, and storage medium
Zhang et al. An efficient depth map filtering based on spatial and texture features for 3D video coding
TWI822032B (zh) Video playback system, portable video playback device, and video enhancement method
CN117237259B (zh) Compressed video quality enhancement method and apparatus based on multimodal fusion
CN114205646B (zh) Data processing method and apparatus, electronic device, and storage medium
US20240095878A1 (en) Method, electronic device, and computer program product for video processing
US11948275B2 (en) Video bandwidth optimization within a video communications platform

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22848304

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE