WO2022127565A1 - Video processing method, apparatus, and device - Google Patents

Video processing method, apparatus, and device Download PDF

Info

Publication number
WO2022127565A1
WO2022127565A1 (PCT/CN2021/133650)
Authority
WO
WIPO (PCT)
Prior art keywords
video
data stream
compression
original
compressed
Prior art date
Application number
PCT/CN2021/133650
Other languages
English (en)
French (fr)
Inventor
傅蓉蓉
徐宇啸
徐攀
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022127565A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream

Definitions

  • the present application relates to the field of communication technologies, and in particular, to a video processing method, apparatus, and device.
  • Video processing includes video encoding.
  • Video encoding can exploit the characteristics of images to eliminate data redundancy in the images of a video, thereby compressing the video and significantly reducing the space the encoded video occupies.
  • the size of the video can also be reduced by means of rate control.
  • the rate control based on the region of interest is one of the more common ones.
  • different compression ratios are configured for different regions of the image in the video.
  • the area to which the human eye is more sensitive (that is, the region of interest) is configured with a smaller compression ratio, so that the image of this area is kept as complete as possible, while the area to which the human eye is less sensitive (that is, the non-region of interest) is configured with a larger compression ratio, so as to minimize the space occupied by the image of this area after encoding.
  • the present application provides a video processing method, apparatus, and device, so as to provide a high-quality video processing method.
  • an embodiment of the present application provides a video processing method.
  • the video processing method is performed by a video processing system.
  • the video processing system first obtains a compressed video to be processed.
  • the compressed video includes A plurality of data segments obtained using the first compression method and the second compression method.
  • the video processing system can decode the compressed video and decompose it into two videos: a first video and a second video.
  • the first video is obtained by decoding the data segments of the compressed video that were produced by the first compression method;
  • the second video is obtained by decoding the data segments of the compressed video that were produced by the second compression method.
  • after the first video and the second video are obtained, the video processing system can perform video restoration according to the first video and the second video, generating a third video similar to the original video, where the image sequence of the third video is the same as that of the original video.
  • in this way, the video processing system can use two different compression methods to compress the original video, obtaining a compressed video that carries both high-resolution data and data at the same frame rate as the original video;
  • the compressed video is decompressed to obtain a high-resolution video and a video with the original frame rate, and a third video that is closer to the original video is then restored from these two videos, so the degree of video restoration is higher.
  • the video processing system may first compress the original video to obtain the compressed video. Specifically, the video processing system may use the first compression method and the second compression method to compress the original video respectively to obtain the compressed video.
  • the first compression method is a compression method of down-sampling the original video; the second compression method is a compression method of extracting images from the original video at preset time intervals.
  • the compressed video obtained by combining the two different compression methods can include both a high-resolution video and a video with the original frame rate, which makes it convenient for the video processing system to restore the third video from the compressed video.
  • the two compression methods may be used to obtain two data streams including multiple data segments.
  • the data stream obtained by compressing the original video by using the first compression method is the first data stream
  • the data stream obtained by compressing the original video by using the second compression method is the second data stream.
  • the video processing system can obtain the compressed video according to the first data stream and the second data stream.
  • the video processing system can use different compression methods to obtain two data streams and obtain the compressed video by mixing the two data streams, so that the compressed video carries data that is closer to the original video; generating the compressed video in this way is convenient and can improve the compression efficiency of the original video.
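The mixing of the two data streams can be sketched as follows. This is an illustrative Python sketch, not the application's actual container format: the `Segment` class and the identifier values `STREAM1`/`STREAM2` are assumptions made for the example.

```python
from dataclasses import dataclass

# Identifier values marking which data stream a segment belongs to
# (hypothetical values; the application only requires that an identifier exist).
STREAM1, STREAM2 = 1, 2

@dataclass
class Segment:
    stream_id: int   # identifier: which data stream the segment belongs to
    payload: bytes   # encoded data segment

def mix_streams(first_stream, second_stream):
    """Interleave segments of the two data streams into one compressed video.

    Each segment keeps an identifier so a decoder can later split the
    compressed video back into the first and second data streams.
    """
    mixed = []
    i = j = 0
    while i < len(first_stream) or j < len(second_stream):
        if i < len(first_stream):
            mixed.append(Segment(STREAM1, first_stream[i]))
            i += 1
        if j < len(second_stream):
            mixed.append(Segment(STREAM2, second_stream[j]))
            j += 1
    return mixed
```

The interleaving order is arbitrary here; any scheme works as long as every segment carries its stream identifier.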
  • when the video processing system uses the first compression method to compress the original video to generate the first data stream, it may first down-sample the images in the original video to generate a fourth video, where the fourth video is a video with the same frame rate as the original video, and then encode the fourth video to generate the first data stream.
  • the video processing system can also downsample the images in the original video to generate the fourth video, filter the images in the fourth video to obtain the fifth video, and then encode the fifth video to generate the first data stream.
  • the video processing system can generate the first data stream in a down-sampling manner, so as to ensure the acquisition of video data with a constant frame rate.
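The down-sampling step of the first compression method can be illustrated with the following Python sketch, where a frame is modelled as a 2-D list of pixel values and nearest-neighbour selection stands in for whatever down-sampling filter is used in practice; the encoding step that would follow is omitted, and the function names are illustrative, not from the application.

```python
def downsample_frame(frame, factor=2):
    """Nearest-neighbour down-sampling of one frame (2-D list of pixels)."""
    return [row[::factor] for row in frame[::factor]]

def first_compression(video, factor=2):
    """First compression method: down-sample every frame.

    Every frame of the original video is kept, so the resulting fourth
    video has the same frame rate as the original, at a lower resolution.
    """
    return [downsample_frame(f, factor) for f in video]
```

Note that the frame count is unchanged while each dimension shrinks by `factor`, which is exactly the "same frame rate, lower resolution" property the fourth video needs.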
  • when the video processing system compresses the original video by the second compression method to generate the second data stream, it may extract images from the original video at preset time intervals and form the extracted images into a sixth video, where the sixth video is a video with the same resolution as the original video; after the sixth video is obtained, the sixth video can be encoded to generate the second data stream.
  • the video processing system can obtain the second data stream relatively quickly by extracting images and encoding, which is highly efficient and can effectively improve the video compression efficiency.
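The second compression method, extracting full-resolution frames at a preset time interval, can be sketched as follows (frames are modelled as list elements and the subsequent encoding step is omitted; the function name is illustrative):

```python
def second_compression(video, fps, interval_s):
    """Second compression method: extract full-resolution frames.

    One frame is taken every `interval_s` seconds of playback time, so the
    resulting sixth video keeps the original resolution but has far fewer
    frames than the original video.
    """
    step = max(1, round(interval_s * fps))  # frames per extraction interval
    return video[::step]
```

For a 25 fps video, an interval of 1 s keeps one frame in every 25, while an interval equal to the frame period (0.04 s) would keep every frame.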
  • the compressed video further includes an identifier, and the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • the compressed video when the video processing system decodes the compressed video, the compressed video may be decomposed into a first data stream and a second data stream; after that, the first data stream and the second data stream are respectively Decoding is performed to obtain the first video and the second video.
  • the video processing system can decompose the compressed video into two different data streams, and obtain two videos by simply decoding the two data streams.
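The decomposition step can be sketched as follows, modelling the compressed video as a sequence of (identifier, payload) pairs; the actual bitstream layout is not specified by the application, so this is only illustrative.

```python
def split_compressed_video(segments):
    """Decompose a compressed video into the first and second data streams.

    Each segment is an (identifier, payload) pair; the identifier indicates
    which data stream the segment belongs to (here 1 = first, 2 = second).
    """
    first_stream, second_stream = [], []
    for stream_id, payload in segments:
        (first_stream if stream_id == 1 else second_stream).append(payload)
    return first_stream, second_stream
```

Each recovered data stream would then be decoded independently to obtain the first and second videos.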
  • the video processing system may obtain a seventh video from the second video, where the resolution of the images in the seventh video is higher than the resolution of the images in the second video; the manner of obtaining the seventh video from the second video is not limited here.
  • for example, single-image super-resolution reconstruction may be performed on the multiple frames of images in the second video to obtain multiple frames of high-resolution images, and these high-resolution images constitute the seventh video.
  • the video processing system may generate a third video that is closer to the original video according to the first video and the seventh video, where the difference between the resolution of the third video and the resolution of the original video is less than a first threshold, and the difference between the frame rate of the third video and the frame rate of the original video is less than a second threshold.
  • the video processing system can obtain the third video with less difference from the original video, and the video restoration degree is high.
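A minimal sketch of the restoration step, with nearest-neighbour upscaling as a stand-in for the super-resolution reconstruction described above (the application does not prescribe this method, and `step`, the spacing of the full-resolution reference frames, is a parameter assumed for the example):

```python
def upscale_frame(frame, factor=2):
    """Naive nearest-neighbour upscaling (a stand-in for super-resolution)."""
    widened = [[p for p in row for _ in range(factor)] for row in frame]
    return [row[:] for row in widened for _ in range(factor)]

def restore_third_video(first_video, second_video, step, factor=2):
    """Restore a third video from the two decoded videos.

    first_video: low-resolution frames at the original frame rate.
    second_video: full-resolution frames taken every `step` frames.
    Every frame is upscaled; where a full-resolution reference frame exists
    (every `step`-th position), it replaces the upscaled frame directly.
    """
    third = [upscale_frame(f, factor) for f in first_video]
    for k, ref in enumerate(second_video):
        pos = k * step
        if pos < len(third):
            third[pos] = ref
    return third
```

The result has the frame rate of the first video and the resolution of the second video, which is the property the third video is required to have.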
  • an embodiment of the present application provides a video processing method, and the beneficial effects can be found in the description of the first aspect, which will not be repeated here.
  • the method is performed by a decompression device.
  • the decompression device first obtains the compressed video to be processed, where the compressed video includes a plurality of data segments obtained by applying the first compression method and the second compression method to the original video respectively;
  • the decompression device can obtain the first video and the second video by decoding the compressed video, where the first video is obtained by decoding the data segments produced by the first compression method;
  • the second video is obtained by decoding the data segments produced by the second compression method.
  • the decompression device may determine a third video according to the first video and the second video, where the image sequence of the third video is the same as that of the original video.
  • when the decompression device decodes the compressed video to obtain the first video and the second video, it may decompose the compressed video into the first data stream and the second data stream, decode the first data stream to generate the first video, and decode the second data stream to generate the second video.
  • a seventh video may be obtained from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video; a third video is then generated according to the first video and the seventh video, where the difference between the resolution of the seventh video and the resolution of the original video is less than the first threshold, and the difference between the frame rate of the third video and the frame rate of the original video is less than the second threshold.
  • the compressed video further includes an identifier, and the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • the decompression device may decompose the compressed video into the first data stream and the second data stream according to an identifier in the compressed video.
  • an embodiment of the present application provides a video processing method, and the beneficial effects can be found in the description of the first aspect, which will not be repeated here.
  • the method is performed by a compression device.
  • the compression device can compress the original video by using a first compression mode and a second compression mode respectively to obtain a compressed video, wherein the first compression mode is a compression mode of down-sampling the original video.
  • the second compression mode is a compression mode in which images are extracted from the original video according to a preset time interval.
  • the compressing apparatus compresses the original video in a first compression manner to obtain a first data stream, and the first data stream includes at least one data segment; the compression apparatus compresses the original video in a second compression manner to obtain a second data stream , the second data stream includes at least one data segment; after obtaining the first data stream and the second data stream, the compressing apparatus may obtain a compressed video according to the first data stream and the second data stream.
  • when the compression device uses the first compression method to compress the original video to generate the first data stream, it may down-sample the images in the original video to generate the fourth video, screen the images in the fourth video to obtain a fifth video, and encode the fifth video to generate the first data stream.
  • when the compression device uses the second compression method to compress the original video to generate the second data stream, it may extract images from the original video at preset time intervals to generate the sixth video, and encode the sixth video to generate the second data stream.
  • the compressed video further includes an identifier, and the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • an embodiment of the present application provides a video processing apparatus, and the beneficial effects can be found in the description of the first aspect, which will not be repeated here.
  • the video processing device includes an acquisition unit, a decoding unit and a restoration unit:
  • the obtaining unit is configured to obtain the compressed video to be processed, where the compressed video includes a plurality of data segments obtained by using the first compression mode and the second compression mode respectively for the original video.
  • a decoding unit, configured to decode the compressed video to obtain a first video and a second video, where the first video includes the result of decoding the data segments obtained by the first compression method, and the second video includes the result of decoding the data segments obtained by the second compression method.
  • the restoration unit is configured to determine a third video according to the first video and the second video, where the image sequence of the third video is the same as that of the original video.
  • the apparatus further includes a first compression unit, a second compression unit and a mixing unit.
  • the first compression unit is used for compressing the original video by adopting a first compression mode, and the first compression mode is a compression mode of down-sampling the original video.
  • the second compression unit is used for compressing the original video by using a second compression method, and the second compression method is a compression method in which images are extracted from the original video according to a preset time interval.
  • the mixing unit is configured to obtain the compressed video according to the data after the original video is compressed by the first compression unit and the second compression unit.
  • the data obtained by the first compression unit compressing the original video in the first compression mode is a first data stream, and the first data stream includes at least one data segment;
  • the data obtained by the second compression unit compressing the original video in the second compression mode is a second data stream, and the second data stream includes at least one data segment;
  • the mixing unit can obtain the compressed video according to the first data stream and the second data stream.
  • when the first compression unit uses the first compression method to compress the original video to generate the first data stream, it may first down-sample the images in the original video to generate the fourth video, then screen the images in the fourth video to obtain the fifth video, and finally encode the fifth video to generate the first data stream.
  • when the second compression unit compresses the original video by the second compression method to generate the second data stream, it may extract images from the original video at a preset time interval and combine the extracted images into a sixth video; after that, the second compression unit encodes the sixth video to generate the second data stream.
  • the compressed video further includes an identifier, where the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • when decoding the compressed video to obtain the first video and the second video, the decoding unit may decompose the compressed video into the first data stream and the second data stream, and then decode the two data streams to obtain the first video and the second video.
  • the restoration unit may obtain a seventh video from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video;
  • after that, a third video is generated according to the first video and the seventh video, where the difference between the resolution of the seventh video and the resolution of the original video is less than the first threshold, and the difference between the frame rate of the third video and the frame rate of the original video is less than the second threshold.
  • the decoding unit when decomposing the compressed video into the first data stream and the second data stream, may decompose the compressed video into the first data stream and the second data stream according to the identifier in the compressed video.
  • the present application provides a decompression apparatus, which has the functions implemented by the decompression apparatus in the second aspect and any possible design of the second aspect.
  • the device function may be implemented by hardware, or by executing corresponding software by hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the structure of the apparatus includes an acquisition unit, a decoding unit, and a restoration unit, and these units can perform the corresponding functions in the method example of the second aspect; for details, please refer to the detailed description in the method example, which will not be repeated here.
  • the present application provides a compression device, which has the functions implemented by the compression device in the third aspect and any possible design of the third aspect.
  • the device function may be implemented by hardware, or by executing corresponding software by hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the structure of the apparatus includes a first compression unit, a second compression unit, and a mixing unit, and these units can perform the corresponding functions in the method example of the third aspect; for details, please refer to the detailed description in the method example, which will not be repeated here.
  • the present application further provides a computing device, and for beneficial effects, reference may be made to the description of the first aspect and any possible implementation manner of the first aspect and will not be repeated here.
  • the structure of the computing device includes a processor and a memory, and the processor is configured to perform the corresponding functions of the decompression apparatus or the compression apparatus in the first aspect and any possible implementation manner of the first aspect.
  • the memory is coupled with the processor, and stores necessary program instructions and data of the decompression device, the compression device or the video processing system.
  • the structure of the device also includes a communication interface for communicating with other devices.
  • the present application also provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the method of the first aspect and any possible implementation of the first aspect.
  • the present application further provides a computer program product comprising instructions which, when run on a computer, cause the computer to execute the method of the first aspect and any possible implementation of the first aspect.
  • the present application further provides a computer chip, where the chip is connected to a memory, and the chip is used to read and execute a software program stored in the memory and execute the method of the first aspect and any possible implementation of the first aspect.
  • FIGS. 1A to 1C are schematic structural diagrams of a system provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a video transmission method provided by an embodiment of the present application.
  • FIG. 3A is a schematic flowchart of a method for extracting images provided by an embodiment of the present application.
  • FIG. 3B is a schematic flowchart of a down-sampling method provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a method for mixing data streams according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a method for generating a compressed video according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a method for decomposing a compressed video provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a method for forming a new video by utilizing video 3 and video 4 according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a reference-based super-resolution reconstruction module provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for processing compressed video provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a decompression apparatus provided by an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of a compression device provided by an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • FIG. 1A is a schematic diagram of the architecture of a video processing system to which the embodiments of the present application are applied; the system includes a video sending system 100 and a video receiving system 200.
  • the video sending system 100 can compress the original video to generate a compressed video.
  • the video sending system 100 can apply two different compression methods to the original video: one is a compression method of down-sampling the original video (for convenience of description, this compression method is called the first compression method), and the other is a compression method of extracting images from the original video at a preset time interval (for convenience of description, this compression method is called the second compression method).
  • the time interval is determined based on the image playback time of the original video; for example, the value of the time interval can be equal to the interval between the playback times of two image frames in the original video when the original video is played.
  • alternatively, the time interval can be set to another value; the embodiment of the present application does not limit the specific value of the time interval.
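The example above, where the interval equals the spacing between two adjacent frames, follows directly from the frame rate; a short sketch of the arithmetic (function name is illustrative):

```python
def frame_interval_s(fps):
    """Playback-time interval between two adjacent frames, in seconds."""
    return 1.0 / fps
```

For a 25 fps original video this gives 0.04 s; extracting at exactly this interval would take every frame, while a larger interval (for example 1 s) would take one frame out of every 25.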
  • the video sending system 100 includes a client 110 and a video encoding system 120 , and the client 110 and the video encoding system 120 are connected through a network including a local area network, the Internet, or a wireless network.
  • the client 110 is used to generate the original video to be encoded, and the client 110 can be deployed in computing devices such as a user's terminal, personal computer, and tablet computer.
  • the video encoding system 120 includes one or more devices for compressing the original video, and the embodiment of the present application does not limit the deployment position and form of the one or more devices in the video encoding system 120 .
  • the device can be a hardware device, such as a server or a terminal computing device; it can also be a software device, specifically a software system running on a hardware computing device; the device can also be a device in a virtualized form, such as a virtual machine.
  • the location where the device is deployed is not limited in the embodiments of the present application.
  • the apparatus can be deployed in a cloud computing device system (including at least one cloud computing device, such as a server), in an edge computing device system (including at least one edge computing device, such as a server or a desktop computer), or on various terminal computing devices, such as notebook computers, personal desktop computers, and mobile phones.
  • each device may perform the video encoding process in a distributed computing manner, or each device may perform one video encoding process.
  • the video encoding system 120 may include four devices, which are a video acquisition device, a first compression device, a second compression device, and a video transmission device.
  • the video acquisition device can be used to acquire the original video generated by the client 110.
  • the client 110 can be a device with a video shooting function (such as a video camera, a digital camera, a monitoring device, a mobile phone, or a tablet computer), and the video it shoots is the original video.
  • the client 110 may also be a device installed with video editing software, and the video edited by the client may be used as the original video, and the client 110 may send the original video to the video acquisition device.
  • after the video acquisition device acquires the original video, it may send the original video to the first compression device and the second compression device respectively.
  • the first compressing device compresses the original video by using the first compression method.
  • the second compressing device compresses the original video by using the second compression method.
  • the first compression method is a compression method of down-sampling the original video. Further, the first compression method also includes an encoding operation, which can be performed after down-sampling the original video; that is, the original video is down-sampled first, and the down-sampled video is then encoded to compress the original video. Optionally, the first compression method also includes a screening operation, which can be performed after down-sampling and before encoding; that is, the original video is down-sampled first, the video obtained after down-sampling is screened, and the screened video is then encoded.
  • the second compression method is a compression method for extracting images from the original video.
  • further, the second compression method also includes an encoding operation, which can be performed after extracting images from the original video; that is, after images are extracted from the original video, the video formed by the extracted images can be encoded to realize the compression of the original video.
  • the first compression device sends the compressed data to the video sending device.
  • the second compression device sends the compressed data to the video sending device.
  • the video sending device generates a compressed video according to the received data and sends the compressed video.
  • multiple apparatuses in the video encoding system 120 may be deployed in the same system or server; for example, the multiple apparatuses may be deployed across the three environments of cloud computing device system, edge computing device system, and terminal computing device, or in any one or two of these three environments.
  • the video encoding system 120 and the client 110 can also be deployed as one, that is to say, the video encoding system 120 is used both to generate the original video and to encode the original video, and the compressed video is sent to the video receiving system 200.
  • after the video sending system 100 generates the compressed video, it can send the compressed video to the video receiving system 200.
  • the method of data interaction between the video sending system 100 and the video receiving system 200 is not limited here.
  • the form of data interaction between the video sending system 100 and the video receiving system 200 is related to the connection manner between the two systems.
  • the video sending system 100 and the video receiving system 200 can be connected by a wired cable such as an optical fiber
  • the video sending system 100 can send the compressed video to the video receiving system 200 through the wired cable.
  • the video sending system 100 and the video receiving system 200 can be connected through a wireless link such as Bluetooth and WIFI
  • the video sending system 100 can send compressed video to the video receiving system 200 through a wireless link such as Bluetooth and WIFI.
  • the video receiving system 200 can decompress the compressed video, and restore the compressed video to a third video that is close to or identical to the original video.
  • the process of restoring the compressed video by the video receiving system 200 is an "inverse" process of the process of compressing the original video by the video sending system 100 .
  • After receiving the compressed video, the video receiving system 200 decodes the compressed video to obtain a first video and a second video, and then obtains a third video according to the first video and the second video. The way the compressed video is decoded into the first video and the second video corresponds to the way the first data stream and the second data stream were combined into the compressed video in the video sending system 100. After obtaining the first video and the second video, the video receiving system 200 may obtain the original video based on them, or a third video that includes the same image sequence as the original video.
  • that the third video and the original video include the same image sequence means some or all of the following: the number of frames (the number of images) in the third video is the same as the number of frames in the original video; the similarity between corresponding images in the third video and the original video is greater than a threshold; and the number of image pairs between the third video and the original video whose similarity exceeds the threshold is greater than a set value.
  • the specific values of the threshold value and the set value may be configured manually, or may be empirical values.
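The "same image sequence" criterion above leaves the similarity metric unspecified. As a minimal illustrative sketch only, the check could be implemented with an MSE-based similarity score; the function names, the metric, and the default `threshold`/`set_value` values are all assumptions, not part of the patent.

```python
import numpy as np

def similarity(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Illustrative similarity score in (0, 1]: 1.0 for identical images,
    decreasing as the mean squared error between the images grows."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    return 1.0 / (1.0 + mse)

def same_image_sequence(video_a, video_b, threshold=0.5, set_value=0):
    """Check the criterion from the text: equal frame counts, and the number
    of frame pairs whose similarity exceeds `threshold` is greater than
    `set_value`."""
    if len(video_a) != len(video_b):
        return False
    similar = sum(1 for a, b in zip(video_a, video_b)
                  if similarity(a, b) > threshold)
    return similar > set_value

frames = [np.full((4, 4), i, dtype=np.uint8) for i in range(3)]
print(same_image_sequence(frames, [f.copy() for f in frames]))  # True
```

As the text notes, in practice the threshold and set value could be configured manually or taken from empirical values.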
  • the video receiving system 200 includes a decompression system 210 and a client 220.
  • the decompression system 210 is used to decompress the compressed video sent by the video sending system 100 and to send the processed video to the client 220, so that the user can view the restored video through the client 220.
  • the connection manner of the decompression system 210 and the client 220 is similar to the connection and deployment manner of the client 110 and the video coding system 120 in the video sending system 100.
  • the decompression system 210 includes one or more devices for decompressing the compressed video, and the embodiment of the present application does not limit the deployment position and form of the one or more devices in the decompression system 210.
  • the deployment positions and shapes of the devices in the video decompression system 210 are similar to the deployment positions and shapes of the devices in the video coding system 120 , and details can be found in the foregoing content, which will not be repeated here.
  • each device may perform video decompression processing in a distributed computing manner, or each device may perform one video decompression process.
  • the decompression system 210 may include several devices, for example a video decomposition device, a decoding device, and a video reconstruction device.
  • After receiving the compressed video, the video decomposition device can decompose the first data stream and the second data stream from the compressed video, and then send the first data stream and the second data stream to the decoding device; the decoding device decodes the first data stream and the second data stream respectively to obtain the first video and the second video.
  • the decoding device sends the first video and the second video to the video reconstruction device.
  • the video reconstruction device has a reference-based super-resolution reconstruction capability, and can generate a third video similar to the original video by using the first video and the second video.
  • Multiple apparatuses in the decompression system 210 may be deployed in the same system or server. For example, the multiple apparatuses may be deployed across the three environments of cloud computing device system, edge computing device system, and terminal computing device, or in any two or one of these three environments.
  • 1. Video, video resolution, and video frame rate.
  • the video includes multiple frames of images, and the multiple frames of images are arranged according to the playback sequence to form an image sequence.
  • the resolution of the video refers to the resolution of the image in the video.
  • the resolution of the image can indicate the number of pixels in the image. Taking an image with a resolution of M*N as an example, the image includes M*N pixels, where the number of pixels in the length direction is M and the number of pixels in the width direction is N.
  • the frame rate of a video refers to the number of frames of images that can be displayed per unit of time when the video is played (one frame of image is one image).
  • video 1, video 2, video 3, video 4, and video 5 are all videos.
  • 2. Video encoding (also referred to as encoding) and video decoding.
  • 3. Data stream.
  • two frames of images that are close in playback order in a video contain much similar information, and this similar information is redundant information. Through video encoding, the redundant information in the original video can be removed, and a video can be converted into a file in another format.
  • a file in another format generated after the video is encoded is called a data stream, and the data stream includes multiple data segments.
  • Each frame of image in the video generates a data segment in the data stream after video encoding (that is, the data segments in the data stream correspond to the images in the video, and each frame of image generates a corresponding data segment after video encoding).
  • the process of restoring a data stream to a video is video decoding.
  • the processes of video encoding and video decoding are reciprocal.
  • each data segment in the data stream can be restored to a frame of image in the video.
  • the data stream 1, the data stream 2, and the compressed video are all data streams.
  • a video processing method provided by an embodiment of the present application will be described below with reference to FIG. 2. Referring to FIG. 2, the method includes:
  • Step 201 The video sending system 100 acquires the original video.
  • the original video includes multiple frames of images, and in the original video, the multiple frames of images are sorted according to the video playback time to form an image sequence.
  • the present application does not limit the manner in which the video sending system 100 obtains the original video.
  • the original video may be a video shot by a client (eg, a video camera, a digital camera, or a monitoring device) with a video shooting function.
  • the original video may be a video processed by a client with a video processing function (eg, a computing device installed with video editing software).
  • After the video sending system 100 obtains the original video, it can use two compression methods to process it: one is image extraction from the original video in the time dimension (video 1 is obtained after image extraction; see steps 202 to 203); the other is direct downsampling of the original video (video 2 is obtained after downsampling; see the descriptions of steps 204 to 205).
  • Step 202 The video sending system 100 extracts images from the original video according to preset time intervals to generate video 1 .
  • Step 202 is image extraction in the time dimension, which is to reduce the number of images that can be played in the original video per unit time, that is, to reduce the frame rate of the original video.
  • the video sending system 100 may extract images from the original video at certain time intervals, and combine the extracted multiple frames of images into a video 1 .
  • the video transmission system 100 may extract a frame of images every 0.5 seconds.
  • the specific value of the time interval is not limited here. The time interval can be determined according to the frame rate of the original video; for example, a smaller time interval can be used for an original video with a higher frame rate, and a longer time interval for an original video with a lower frame rate. The time interval can also be determined according to the content displayed by the original video; for example, for an original video in which the frames are similar, a larger time interval can be used, while for an original video in which the frames differ considerably, a smaller time interval can be used.
  • the time interval is determined based on the playback time of the images in the original video. The number of images that can be displayed within the time interval can be calculated by multiplying the time interval by the frame rate. Taking this number to be P as an example, the process of extracting images from the original video at the time interval can be understood as extracting one frame of image every P frames, and the extracted multiple frames of images form video 1.
  • FIG. 3A is a schematic diagram of image extraction from the original video in the time dimension. Through image extraction, the resolution of the extracted images is maintained while the frame rate of the original video (the number of images playable per unit time) is reduced, and video 1 is obtained. The number of frames in video 1 is lower than the number of frames in the original video, and each frame of image in video 1 is a frame of image in the original video; that is, for every frame in video 1 there is an identical image in the original video, and a frame of image in video 1 corresponds to that identical frame of image in the original video.
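The every-P-frames extraction of step 202 can be sketched as follows. This is a minimal illustration with frames represented as numpy arrays; the function name and the choice of keeping frame indices 0, P, 2P, … are assumptions.

```python
import numpy as np

def extract_frames(original: list, interval_p: int) -> list:
    """Time-dimension compression: keep one frame every `interval_p` frames.
    Resolution is unchanged; only the frame rate drops."""
    return [frame for i, frame in enumerate(original) if i % interval_p == 0]

# 10-frame original video; each frame is a 2x2 image filled with its index.
original_video = [np.full((2, 2), i, dtype=np.uint8) for i in range(10)]
video_1 = extract_frames(original_video, interval_p=3)  # keeps frames 0, 3, 6, 9
print(len(video_1))  # 4
```

Each kept frame is identical to a frame of the original video, matching the correspondence described above.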
  • Step 203 The video sending system 100 encodes the video 1 to obtain the data stream 1 .
  • the video sending system 100 can encode the video 1.
  • the encoding method of the video 1 is not limited here.
  • the video 1 can be encoded based on the general industry standard (H264 or H265).
  • When the video sending system 100 encodes video 1, it encodes each frame of image in video 1. The video sending system 100 can encode a frame of image in video 1 to generate a data segment in data stream 1; the data segments generated by encoding each frame of video 1 constitute data stream 1, and the ordering of the frames in video 1 is consistent with the ordering of the corresponding data segments in data stream 1.
  • video 1 can be generated by extracting images from the original video according to a certain time interval, and data stream 1 can be generated by encoding video 1, that is, data stream 1 is generated from the original video after image extraction and encoding.
  • Step 204 The video sending system 100 downsamples the original video to generate a video 2, where the video 2 includes multiple frames of images, and the multiple frames of images are images obtained by reducing the resolution of the multiple frames of images in the original video.
  • the downsampling performed in step 204 is the downsampling of each frame of image in the video. Downsampling a frame of image can be understood as reducing the image according to a preset magnification, that is, reducing the resolution of the image.
  • For an image 1 with a resolution of M*N, where M refers to the number of pixels in the length direction of image 1 and N refers to the number of pixels in the width direction, performing downsampling with a magnification of S on image 1 yields an image 2 with resolution (M/S)*(N/S); that is, the number of pixels in the length direction of image 2 is M/S and the number of pixels in the width direction is N/S. In other words, when downsampling image 1 with a magnification of S, S pixels are reduced to 1 pixel in the length direction and S pixels are reduced to 1 pixel in the width direction, so the image is reduced by a factor of S in both directions. Performing downsampling with a magnification of S on image 1 changes each S*S pixel matrix in image 1 into one pixel, and image 2 is obtained.
  • The above takes as an example the case in which the reduction factors in the length direction and the width direction are the same, both S. In practical applications, different reduction factors can also be used in the length direction and the width direction: reducing by S1 times in the length direction (that is, reducing S1 pixels to 1 pixel in the length direction) and by S2 times in the width direction (that is, reducing S2 pixels to 1 pixel in the width direction). Downsampling with different factors in the length direction and the width direction changes each S1*S2 pixel matrix in image 1 into one pixel.
  • the present application does not limit the specific value of the magnification used for downsampling, and the magnification may be a fixed value, which is the default value of the video sending system 100 .
  • the magnification can also be a value determined according to the content presented in the original video. For example, when the similarity between adjacent frames of the original video is relatively high, a larger magnification can be used; when the similarity between adjacent frames is low, a smaller magnification can be used. In practical applications, different magnifications can be used for different images in the original video: for example, a larger magnification can be used for multiple frames whose similarity exceeds a certain threshold, and a smaller magnification otherwise.
  • This application does not limit the manner of downsampling, which includes but is not limited to: maximum sampling (taking the largest pixel value in the S*S pixel matrix), average sampling (taking the average of the pixel values in the S*S pixel matrix), summed-area sampling (taking the sum of the pixel values in the S*S pixel matrix), random-area sampling (randomly taking one pixel value in the S*S pixel matrix), bilinear interpolation, bicubic interpolation, and downsampling based on a convolutional neural network.
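Of the manners listed above, average sampling is perhaps the simplest to illustrate. The sketch below, with an illustrative function name and separate factors s1 and s2 as in the text, turns each s1*s2 pixel block into one pixel holding the block mean; edge rows and columns that do not fill a whole block are simply dropped, which is one possible convention, not the patent's.

```python
import numpy as np

def downsample_avg(img: np.ndarray, s1: int, s2: int) -> np.ndarray:
    """Average sampling: each s1*s2 pixel block becomes one pixel whose value
    is the block mean, so an M*N image becomes roughly (M/s1)*(N/s2)."""
    m = img.shape[0] - img.shape[0] % s1  # trim rows that don't fill a block
    n = img.shape[1] - img.shape[1] % s2  # trim columns likewise
    blocks = img[:m, :n].reshape(m // s1, s1, n // s2, s2)
    return blocks.mean(axis=(1, 3))  # average over each block

img = np.arange(16, dtype=np.float64).reshape(4, 4)  # a 4x4 "image"
small = downsample_avg(img, 2, 2)
print(small.shape)  # (2, 2)
```

Maximum or summed-area sampling would follow the same block layout with `max` or `sum` in place of `mean`.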
  • FIG. 3B is a schematic diagram of downsampling the original video. Through downsampling, the resolution of each frame of image in the original video is reduced while the frame rate of the original video (the number of images playable per unit time) is maintained, and video 2 is obtained. The number of frames in video 2 is the same as the number of frames in the original video, that is, the frame rate remains unchanged, and each frame of image in video 2 is an image of the original video after its resolution has been reduced; a frame of image in video 2 corresponds to the image in the original video from which it was obtained by resolution reduction.
  • Step 205 The video sending system 100 encodes the video 2 to generate a data stream 2 .
  • the video sending system 100 can encode the video 2.
  • the method of encoding the video 2 is not limited here.
  • the video 2 can be encoded based on a general industry standard (H264 or H265).
  • the video transmission system 100 may directly encode video 2 to obtain data stream 2 .
  • the video sending system 100 can also filter the images in video 2 and encode the filtered video 2 to obtain data stream 2.
  • the video sending device 100 may extract images from video 2 according to a preset time interval (the time interval involved in step 202); that is, the video sending system 100 deletes a part of the images in video 2. The deleted part may include one or more frames of images, each of which corresponds to the same image in the original video as an image that was extracted into video 1.
  • When the video sending system 100 encodes video 2 (or video 2 after the images are filtered), it encodes each frame of image in it. The video sending system 100 can encode a frame of image in video 2 (or filtered video 2) to generate a data segment in data stream 2; the data segments generated by encoding each frame constitute data stream 2, and the ordering of the frames in video 2 (or filtered video 2) is consistent with the ordering of the corresponding data segments in data stream 2.
  • In summary, downsampling the original video generates video 2, and encoding video 2 (or video 2 after filtering the images) generates data stream 2; that is, data stream 2 is generated from the original video after downsampling (and, optionally, image filtering) and encoding.
  • Step 206 After obtaining the data stream 1 and the data stream 2, the video sending system 100 can mix the data stream 1 and the data stream 2 to generate a compressed video.
  • Data stream 1 is the data stream obtained by encoding video 1; that is, a data segment in data stream 1 is generated by video-encoding a frame of image in video 1, and the data segment corresponds to that image. Since a frame of image in video 1 is the same as a frame of image in the original video, that is, a frame of image in video 1 corresponds to a frame of image in the original video, a data segment in data stream 1 can be understood as generated by encoding a frame of image in the original video; that is, a data segment in data stream 1 has a corresponding relationship with a frame of image of the original video.
  • Data stream 2 is the data stream obtained by encoding video 2; that is, a data segment in data stream 2 is generated by video-encoding a frame of image in video 2, and the data segment corresponds to that image. Since a frame of image in video 2 is an image of the original video with reduced resolution, that is, a frame of image in video 2 corresponds to a frame of image in the original video, a data segment in data stream 2 can be understood as generated by encoding a frame of image in the original video after its resolution is reduced; that is, a data segment in data stream 2 has a corresponding relationship with a frame of image of the original video.
  • the mixing of data stream 1 and data stream 2 refers to mixing the multiple data segments in the data stream 1 with the multiple data segments in the data stream 2 .
  • the embodiments of the present application do not limit the mixing manner of the data stream 1 and the data stream 2, and the video sending system 100 may place the data stream 1 after the data stream 2 to generate a compressed video.
  • the video transmission system 100 may also place the data stream 2 after the data stream 1 .
  • Video delivery system 100 may also embed data segments in data stream 1 into data stream 2 .
  • When the video sending system 100 mixes data stream 1 and data stream 2, for each data segment in data stream 1, the data segment can be embedded at a target position in data stream 2 such that, in the original video, the image corresponding to that data segment and the image corresponding to the data segment closest to the target position are the same or adjacent. That is, when playing the original video, the image corresponding to the data segment in data stream 1 and the image corresponding to the data segment closest to the target position in data stream 2 either have the same playback time or have similar playback times, that is, the difference of their playback times is within a preset range.
  • the embodiment of the present application does not limit the method for determining the target position.
  • Data stream 1 is the encoded data stream of video 1, and video 1 is generated by extracting images from the original video at preset time intervals; the process of extracting images at a preset time interval may be the process of extracting one image every P frames. According to the correspondence between the images in video 1 and the data segments in data stream 1, a target position can therefore be determined every P data segments in data stream 2, and each time a target position is determined, a data segment in data stream 1 is embedded at that target position.
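The embedding rule above (one data-stream-1 segment after every P data-stream-2 segments) could be sketched as follows. The segment values and the "s1"/"s2" tags standing in for the identifier are illustrative assumptions, not the patent's concrete byte format.

```python
def mix_streams(stream2_segments, stream1_segments, p: int):
    """Embed one data-stream-1 segment after every `p` data-stream-2 segments,
    tagging each segment with its source so the receiver can split them."""
    mixed, it1 = [], iter(stream1_segments)
    for i, seg in enumerate(stream2_segments):
        mixed.append(("s2", seg))
        if (i + 1) % p == 0:  # a target position every p segments
            nxt = next(it1, None)
            if nxt is not None:
                mixed.append(("s1", nxt))
    return mixed

mixed = mix_streams(["a", "b", "c", "d"], ["X", "Y"], p=2)
print(mixed)
# [('s2', 'a'), ('s2', 'b'), ('s1', 'X'), ('s2', 'c'), ('s2', 'd'), ('s1', 'Y')]
```

Placing all of data stream 1 before or after data stream 2, the other options mentioned above, would be trivial concatenations of the two lists.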
  • the video sending system 100 may also add an identifier to the data of the compressed video; the identifier may indicate the data segments belonging to data stream 2 in the compressed video, and it can also indicate the data segments belonging to data stream 1 in the compressed video.
  • There are many ways for the identifier to indicate the data segments belonging to data stream 2 or to data stream 1 in the compressed video; the embodiment of the present application limits neither the indication method of the identifier nor its specific form.
  • For example, the identifier can be added after the last data segment of data stream 1; by identifying the demarcation position between data stream 1 and data stream 2 in the compressed video, it distinguishes the data segments belonging to data stream 1 from those belonging to data stream 2. The data segments before the demarcation position in the compressed video are the data segments of data stream 1, and the data segments after the demarcation position are the data segments of data stream 2.
  • the identifier can also be information independent of the compressed video; when the video sending system 100 sends the compressed video to the video receiving system 200, it also transmits the identifier.
  • the identifier may also be added to one data segment or multiple data segments in the compressed video, for example at the head or end of one or more data segments, as a component of the compressed video.
  • the video transmission system 100 may add the identifier to the header of the data segment located at the demarcation position, and the data segment may be the last data segment belonging to data stream 1 in the compressed video, or the data segment belonging to data stream 2 in the compressed video. The first data segment.
  • the compressed video in this embodiment of the present application may include multiple identifiers, and each identifier is located in a data segment of the compressed video, for example inside the data segment, such as at its head or tail; the identifier is used to indicate that the data segment in which it is located belongs to data stream 2 or data stream 1.
  • the identification may be generated when the video transmission system 100 generates the data stream 2 and the data stream 1 .
  • For example, each time the video sending system 100 generates a data segment, it may add an identifier to the data segment, the identifier indicating the data stream to which the segment belongs.
  • For example, each data segment in a compressed video may include an identifier, the identifier indicating whether the data segment in which it is located belongs to data stream 2 or data stream 1. For another example, in the compressed video, each data segment belonging to data stream 2 may include an identifier while the data segments belonging to data stream 1 do not, the identifier indicating that the data segment in which it is located belongs to data stream 2. For another example, each data segment belonging to data stream 1 may include an identifier while the data segments belonging to data stream 2 do not, the identifier indicating that the data segment belongs to data stream 1.
  • the way in which the data segment where the identifier indicates belongs to data stream 2 or data stream 1 is not limited here.
  • a field is added to the header of the data segment. When the field is the first value, it indicates that the data segment belongs to data stream 1. When this field is the second value, it indicates that the data segment belongs to data stream 2.
  • For example, a field is added to the header of the data segment, and the value P of this field indicates the number of bytes of the data segment in which it is located; that is, the P bytes after the field are the data of that data segment.
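The two header fields described above (a stream flag and a byte-count field) could be combined into a small segment header, sketched below with Python's `struct` module. The 5-byte layout, the flag values 1 and 2, and the big-endian length are all illustrative assumptions; the patent does not fix a concrete wire format.

```python
import struct

# Hypothetical 5-byte header: 1 byte stream flag (1 = data stream 1,
# 2 = data stream 2) followed by a 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")

def pack_segment(stream_id: int, payload: bytes) -> bytes:
    return HEADER.pack(stream_id, len(payload)) + payload

def unpack_segments(buf: bytes):
    """Walk the buffer: each length field says how many bytes after the
    header belong to the current segment."""
    out, pos = [], 0
    while pos < len(buf):
        stream_id, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        out.append((stream_id, buf[pos:pos + length]))
        pos += length
    return out

buf = pack_segment(1, b"frame-A") + pack_segment(2, b"frame-B")
print(unpack_segments(buf))  # [(1, b'frame-A'), (2, b'frame-B')]
```

With such a header, the receiver in step 208 can attribute every segment to its stream without any separate demarcation marker.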
  • the video processing method of the present application is further explained below in conjunction with specific examples.
  • The video sending system 100 can generate video 2 by downsampling the original video and generate video 1 by image extraction; it then encodes video 2 and video 1 respectively to generate data stream 2 and data stream 1, and then generates a compressed video according to data stream 2 and data stream 1.
  • Step 207 The video sending system 100 sends the compressed video to the video receiving system 200.
  • Step 208 After receiving the compressed video, the video receiving system 200 may first decompose the compressed video to generate video 3 and video 4.
  • Steps 208 and 206 are inverse processes.
  • the video receiving system 200 can decompose the compressed video according to the method of generating the compressed video.
  • the video receiving system 200 can first decompose the data stream 1 and the data stream 2 from the compressed video. After that, data stream 1 and data stream 2 are decoded respectively to generate video 3 and video 4.
  • When the video receiving system 200 decomposes data stream 2 and data stream 1 from the compressed video, it may decompose the compressed video into data stream 2 and data stream 1 according to the identifier.
  • For example, the video receiving system 200 determines the demarcation position between data stream 2 and data stream 1 in the compressed video according to the identifier, and then determines the data segments belonging to data stream 2 and the data segments belonging to data stream 1. The video receiving system 200 intercepts the data segments belonging to data stream 2 from the compressed video to obtain data stream 2, and intercepts the data segments belonging to data stream 1 from the compressed video to obtain data stream 1.
  • For another example, when each identifier is used to indicate that the data segment in which it is located belongs to data stream 1 or data stream 2, the video receiving system 200 parses the identifier in each data segment of the compressed video to determine the data segments belonging to data stream 2 and those belonging to data stream 1, and thereby obtains data stream 2 and data stream 1.
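The per-segment identifier parsing just described could be sketched as a simple partition over tagged segments; the "s1"/"s2" tags standing in for the identifier and the list representation are illustrative assumptions.

```python
def split_streams(mixed_segments):
    """Recover data stream 1 and data stream 2 from a mixed sequence in which
    every segment carries a source tag (the 'identifier' of the text)."""
    stream1, stream2 = [], []
    for tag, seg in mixed_segments:
        (stream1 if tag == "s1" else stream2).append(seg)
    return stream1, stream2

mixed = [("s2", "a"), ("s2", "b"), ("s1", "X"), ("s2", "c"), ("s1", "Y")]
s1, s2 = split_streams(mixed)
print(s1, s2)  # ['X', 'Y'] ['a', 'b', 'c']
```

Because the relative order of the segments is preserved within each stream, the two recovered streams can then be decoded independently into video 3 and video 4.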
  • the video receiving system 200 can decode the data stream 2 and the data stream 1 respectively.
  • the method for decoding the data stream 2 and the data stream 1 is not limited here.
  • the manner in which data stream 2 and data stream 1 are decoded corresponds to the manner in which the video sending system 100 encodes video 2 and video 1.
  • the data stream 2 and the data stream 1 can be decoded based on the general industry standard (H264 or H265).
  • FIG. 6 is a schematic diagram of the video receiving system 200 decomposing the compressed video to obtain video 3 and video 4: the video receiving system 200 can decompose the data segments in the compressed video to obtain data stream 2 and data stream 1, and then decode data stream 2 and data stream 1 respectively to obtain video 4 and video 3.
  • Video 1 is the video before data stream 1 was encoded, and video 3 is the video decoded from data stream 1. Ideally, video 1 and video 3 are the same; however, some information may be lost in the process of encoding and decoding, so there may be small differences between video 1 and video 3.
  • Video 2 is the video before data stream 2 was encoded, and video 4 is the video decoded from data stream 2. Ideally, video 2 and video 4 are the same; however, some information may be lost in the process of encoding and decoding, so there may be small differences between video 2 and video 4.
  • Video 3 is the decoded video of data stream 1 and is similar or identical to video 1; that is, video 3 can be regarded as the video obtained by extracting images from the original video, and video 3 maintains the resolution of the original video. Video 4 is the decoded video of data stream 2 and is similar or identical to video 2; that is, video 4 can be regarded as the video obtained after the original video is downsampled (and, optionally, image-filtered), and video 4 maintains the frame rate of the original video.
  • Step 209 The video receiving system 200 generates video 5 according to video 3 and video 4.
  • Video restoration can be performed based on video 3 combined with video 4. Specifically, video 4 can be restored to a candidate video with a higher resolution, where the candidate video includes multiple candidate images; after that, a video 5 with resolution and frame rate close to the original video is obtained according to video 3 and the candidate video.
  • the images in video 4 are divided into two categories: the first category is the reduced-resolution versions of the images included in video 3, and the second category is the reduced-resolution versions of the images in the original video other than those included in video 3.
  • the specific restoration process includes two parts, one part is for the restoration of the first type of images, and the other part is for the restoration of the second type of images.
  • the image in the video 3 can be directly used as the image after the resolution of this type of image has been increased, that is, the restored image.
  • a frame of image in video 5 is obtained by retaining the same image features and fusing the image features with differences.
  • Single image super-resolution (SISR) reconstruction refers to the technology of converting a low-resolution image into a higher-resolution image based on image analysis. SISR includes but is not limited to the super-resolution convolutional neural network (SRCNN), very deep convolutional networks (VDSR), and enhanced deep residual networks for single image super-resolution (EDSR).
  • The fusion of differing image features can be understood as retaining, with a weight per feature, the image features of the reference image and of the candidate image for that frame.
  • Specifically, a corresponding weight can be configured for the reference image and for the candidate image, and the fusion of image features is realized as the sum of the products of each weight and its image feature.
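The weighted fusion described above can be sketched as follows. This is a hedged illustration in which each image feature is a single number and `w_ref`/`w_cand` are hypothetical weight names; the patent does not prescribe specific weight values.

```python
def fuse_features(ref_feats, cand_feats, w_ref=0.6, w_cand=0.4):
    """Fuse the differing features of a reference image and a candidate
    image as the sum of products of each weight and its feature value."""
    assert len(ref_feats) == len(cand_feats)
    return [w_ref * r + w_cand * c for r, c in zip(ref_feats, cand_feats)]
```

With equal weights of 0.5 each, the fusion reduces to the average of the two feature vectors.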
  • The reference image may be an image in video 3 whose similarity with the frame of video 4 is greater than a threshold.
  • Alternatively, the reference image may be the image in video 3 whose position in the original video is adjacent to or the same as that of the frame of video 4.
  • Alternatively, an image in video 3 that is temporally correlated with the frame of video 4 may be used as the reference image; that is, the interval between the playback times, in the original video, of the reference image and of the image corresponding to the frame of video 4 is close, i.e., within a preset range.
  • Alternatively, trained modules can also be used.
  • The video receiving system 200 can be pre-deployed with a reference-based super-resolution (reference-based super-resolution, refSR) reconstruction module, which can use the high-resolution video 3 as a reference to reconstruct the low-resolution video 4 and finally output the high-resolution video 5.
  • Specifically, the reference-based super-resolution reconstruction module can use a frame of video 3 as a reference image to reconstruct a frame of video 4 and generate a frame of video 5.
  • The video receiving system 200 may determine one frame from video 3 as the reference image for each frame of video 4; the method by which the video receiving system 200 determines the reference image for each frame of video 4 is not limited here and is not repeated.
  • In addition, the reference images corresponding to multiple frames of video 4 are allowed to be the same.
  • Specifically, the reference image can be placed after its corresponding frame of video 4 to generate a new video.
  • In the new video, each low-resolution image (belonging to video 4) is followed by one high-resolution image (belonging to video 3), and the new video can be used as the input to the reference-based super-resolution reconstruction module.
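The construction of the new video can be sketched as below, assuming each frame is an opaque object and `refs[i]` is the reference image chosen from video 3 for the i-th frame of video 4; the names are illustrative.

```python
def build_new_video(video4_frames, refs):
    """Interleave: place each low-resolution frame of video 4 followed
    immediately by its high-resolution reference frame from video 3."""
    new_video = []
    for low_res, ref in zip(video4_frames, refs):
        new_video.append(low_res)
        new_video.append(ref)
    return new_video
```

Since the reference images for multiple frames are allowed to be the same, `refs` may contain repeats.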
  • The single-image super-resolution reconstruction module within the reference-based super-resolution reconstruction module performs single-image super-resolution reconstruction on the low-resolution images in the new video, obtaining candidate images.
  • The feature selection module then determines, from each candidate image and its reference image, the image features that differ between the two images and the image features the two images share.
  • The feature fusion module fuses the differing image features and retains the shared image features, thereby obtaining the images of video 5.
  • This embodiment of the present application does not limit the specific form of the reference-based super-resolution reconstruction module.
  • For example, the reference-based super-resolution reconstruction module may be a neural network model that realizes reference-based super-resolution reconstruction through pre-training and learning.
  • The reference-based super-resolution reconstruction module may also be implemented with a refSR calculation library based on the signal-processing mechanism behind reference-based super-resolution reconstruction.
  • FIG. 9 illustrates how the video receiving system 200 processes the compressed video in this embodiment of the application.
  • The video receiving system 200 decomposes data stream 1 and data stream 2, and then performs video decoding on data stream 1 and data stream 2 to obtain video 3 and video 4 respectively.
  • Video 3 and video 4 can be integrated to form a new video.
  • After the new video is input to the reference-based super-resolution reconstruction module, the obtained video 5 is output.
  • The obtained video 5 not only ensures that the resolution of its images is consistent with the original video, but also ensures that its frame rate is consistent with, or differs little from, the frame rate of the original video, so that video 5 is closer to the original video.
  • In the above method, the video sending system 100 compresses the original video in two different ways, so that the generated compressed video carries more information related to the original video.
  • After the video receiving system 200 receives the compressed video, it decodes the compressed video to obtain two different videos, and then uses the two videos to restore a video whose resolution and frame rate are close to those of the original video.
  • This video processing method ensures that, even when the compressed video is small, a video close to the original can be restored on the video receiving system 200 side; the degree of video restoration is higher, and the method is simpler and more efficient.
  • The above embodiments describe the interaction between the video sending system 100 and the video receiving system 200 as an example. The compression and decompression can also exist independently on a single processing device: a processing device can use the method performed by the video sending system 100 in the embodiment shown in FIG. 2 to compress the original video into a compressed video, and save the compressed video to reduce the space required for saving the original video; that is, the original video is preserved by saving the compressed video.
  • When the original video is needed, the processing device can use the method performed by the video receiving system 200 in the embodiment shown in FIG. 2 to obtain a video 5 close to the original video by decompressing the compressed video.
  • The method is simple and efficient, and the degree of video restoration is high.
  • The embodiments of the present application further provide a decompression apparatus 1000 configured to execute the method executed by the video receiving system in the above method embodiments.
  • The decompression apparatus 1000 includes an acquisition unit 1001, a decoding unit 1002, and a restoration unit 1003; in the decompression apparatus, the units are connected through communication paths.
  • the obtaining unit 1001 is configured to obtain a compressed video to be processed, where the compressed video includes a plurality of data segments obtained by adopting the first compression mode and the second compression mode respectively for the original video.
  • The decoding unit 1002 is configured to decode the compressed video to obtain a first video and a second video, where the first video includes the result of decoding the data segments of the compressed video obtained by the first compression method, and the second video includes the result of decoding the data segments obtained by the second compression method.
  • the restoration unit 1003 is configured to determine a third video according to the first video and the second video, where the image sequence of the third video is the same as that of the original video.
  • Each unit in the decompression apparatus 1000 in this embodiment of the present application may be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the decompression apparatus 1000 and each unit in the decompression apparatus 1000 may be software modules.
  • During decoding, the decoding unit 1002 may decompose the compressed video into the first data stream and the second data stream; after that, it decodes the first data stream to generate the first video, and decodes the second data stream to generate the second video.
  • The restoration unit 1003 may obtain a seventh video from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video; after that, a third video is generated from the first video and the seventh video, where the difference between the resolution of the seventh video and that of the original video is less than a first threshold, and the difference between the frame rate of the third video and that of the original video is less than a second threshold.
  • the compressed video further includes an identifier, and the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • the decoding unit 1002 may decompose the compressed video into the first data stream and the second data stream according to the identifier in the compressed video.
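A minimal sketch of that decomposition, assuming each data segment in the compressed video is tagged with a stream identifier; the tag values 1 and 2 and the tuple layout are illustrative assumptions, not a format defined by the patent.

```python
def split_by_identifier(compressed):
    """Route each (identifier, segment) pair of the compressed video to
    the first or second data stream according to its identifier."""
    stream1, stream2 = [], []
    for ident, segment in compressed:
        (stream1 if ident == 1 else stream2).append(segment)
    return stream1, stream2
```

The two returned lists correspond to the first and second data streams, which are then decoded separately.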
  • The decompression apparatus 1000 provided by this embodiment of the present application may correspondingly execute the method performed by the video receiving system in the embodiment described in FIG. 2 of the present application, and the above and other operations and/or functions of the units in the decompression apparatus 1000 respectively implement the corresponding flows of steps 208 to 209 in FIG. 2.
  • The decompression apparatus 1000 decodes the compressed video to generate two different videos, and then restores, based on the two videos, a video whose resolution and frame rate differ little from those of the original video.
  • When the decompression apparatus 1000 parses the compressed video, the two different videos obtained carry information about the original video to different degrees, so that a video closer to the original video can finally be restored, and the quality of the restored video is noticeably improved.
  • The embodiment of the present application further provides a compression apparatus 1100, and the compression apparatus 1100 is configured to execute the method executed by the video sending system in the above method embodiment.
  • the compression device 1100 includes a first compression unit 1101 , a second compression unit 1102 and a mixing unit 1103 .
  • each module is connected through a communication path.
  • the first compression unit 1101 is configured to compress the original video by adopting a first compression method, and the first compression method is a compression method of down-sampling the original video.
  • the second compression unit 1102 is configured to compress the original video by adopting a second compression method, where the second compression method is a compression method in which images are extracted from the original video according to a preset time interval.
  • the mixing unit 1103 is configured to obtain a compressed video according to the data compressed by the first compression unit 1101 and the second compression unit 1102 on the original video.
  • Each unit in the compression apparatus 1100 in the embodiment of the present application may be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the compression apparatus 1100 and each unit of the compression apparatus 1100 may be software modules.
  • the first compression unit 1101 may compress the original video in a first compression manner to generate a first data stream, where the first data stream includes at least one data segment.
  • the second compression unit 1102 may compress the original video in the second compression manner to generate a second data stream, where the second data stream includes at least one data segment.
  • the mixing unit 1103 may obtain the compressed video according to the first data stream and the second data stream.
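One hedged way the mixing unit 1103 could combine the two streams is to tag each data segment with its source stream so the receiving end can separate them again. The tag values and the simple concatenation below are illustrative assumptions; an implementation might instead interleave segments by timestamp.

```python
def mix_streams(stream1, stream2):
    """Label each segment of the two data streams with an identifier
    (1 = first data stream, 2 = second data stream) and combine them
    into one compressed-video container."""
    return [(1, seg) for seg in stream1] + [(2, seg) for seg in stream2]
```

The identifiers carried in the result are what allow the compressed video to be decomposed back into the two data streams.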
  • When the first compression unit 1101 compresses the original video in the first compression mode to generate the first data stream, it can down-sample the images in the original video to generate the fourth video, screen the images in the fourth video to obtain the fifth video, and encode the fifth video to generate the first data stream.
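The down-sampling step can be sketched as simple pixel striding over a frame represented as a list of pixel rows; this is an illustrative assumption, and a production encoder would typically apply a low-pass filter before decimating to avoid aliasing.

```python
def downsample_frame(frame, factor=2):
    """Reduce a frame's resolution by keeping every `factor`-th pixel
    in both dimensions (a real codec would filter before decimating)."""
    return [row[::factor] for row in frame[::factor]]
```

Applying this to every frame of the original video yields the fourth video: lower resolution, unchanged frame rate.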
  • When the second compression unit 1102 compresses the original video using the second compression method to generate the second data stream, it can extract images from the original video at preset time intervals to generate the sixth video, and encode the sixth video to generate the second data stream.
  • the compressed video further includes an identifier, and the identifier is used to indicate a data segment belonging to the first data stream or a data segment belonging to the second data stream in the compressed video.
  • The compression apparatus 1100 provided in this embodiment of the present application may correspondingly execute the method performed by the video sending system in the embodiment described in FIG. 2 of the present application, and the above and other operations and/or functions of the units in the compression apparatus 1100 implement the corresponding flows of steps 201 to 207 in FIG. 2; for brevity, reference may be made to the descriptions in the foregoing method embodiments, and details are not repeated here.
  • The compression apparatus can compress the original video in two different ways, so that the generated compressed video includes both a high-resolution data stream and a data stream with the same frame rate as the original video.
  • The decompression apparatus can then restore the original video from the high-resolution data stream and the same-frame-rate data stream respectively, which makes it convenient for the receiving end of the compressed video to restore a video closer to the original video.
  • The compression method adopted by the compression apparatus is also relatively simple.
  • The embodiments of the present application further provide a video processing apparatus, and the video processing apparatus is configured to execute the methods performed by the video receiving system and the video sending system in the above method embodiments.
  • The video processing apparatus may include the units of the decompression apparatus and the units of the compression apparatus described above; for the functions of these units, refer to the foregoing content, which is not repeated here.
  • The division of modules in the embodiments of the present application is schematic and is only a logical function division; in actual implementation, there may be other division methods.
  • The functional modules in the embodiments of the present application may be integrated into one processing unit, may exist physically alone, or two or more modules may be integrated into one module.
  • The integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the present application, in essence, or the parts that contribute to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a terminal device (which may be a personal computer, a mobile phone, a network device, etc.) or a processor execute all or part of the steps of the methods in the embodiments of the present application.
  • The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the present application also provides a computing device 1200 as shown in FIG. 12 .
  • The computing device 1200 includes a bus 1201, a processor 1202, a communication interface 1203, and a memory 1204. The processor 1202, the memory 1204, and the communication interface 1203 communicate through the bus 1201.
  • the processor 1202 may be a central processing unit (central processing unit, CPU).
  • the memory 1204 may include volatile memory, such as random access memory (RAM).
  • the memory 1204 may also include non-volatile memory, such as read-only memory (ROM), flash memory, HDD, or SSD.
  • Executable code is stored in the memory, and the processor 1202 executes the executable code to perform the method described above in FIG. 2 .
  • the memory 1204 may also include other software modules (eg, multiple units in the decompression apparatus 1000 or multiple units in the compression apparatus 1100 ) required for running processes such as an operating system.
  • The operating system can be Linux™, Unix™, Windows™, and so on.
  • In FIG. 12, only the units of the decompression apparatus 1000 in the memory 1204 are drawn, as an example.
  • the processor 1202 may invoke the software module in the memory 1204 to execute the method executed by the video receiving system 200 in the above method embodiments.
  • the processor 1202 may invoke the software module in the memory 1204 to execute the method executed by the video sending system 100 in the above method embodiments.
  • the processor 1202 may invoke the software modules in the memory 1204 to execute the methods performed by the video receiving system 200 and the video sending system 100 in the above method embodiments .
  • The present application further provides a computing device system that includes at least two computing devices 1200 as shown in FIG. 12.
  • Any two computing devices 1200 communicate through a communication network, where one computing device runs the decompression apparatus 1000 and the other runs the compression apparatus 1100, and the two computing devices are respectively used to execute the operation steps of the corresponding subject in the method executed by the video sending system 100 or the video receiving system 200 in the above method embodiments.
  • The above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When software is used, the embodiments can be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes computer program instructions, and when the computer program instructions are loaded and executed on a computer, the processes or functions described in FIG. 2 according to the embodiments of the present invention are generated in whole or in part.
  • The present application further provides a computing device system that includes two or more computing devices in virtualized form, such as virtual machines and containers, where each virtual machine or container is used to execute the operation steps of the corresponding subject in the method performed by the video sending system 100 or the video receiving system 200 in the above method embodiments.
  • The virtual machines or containers run on the computing devices of the computing device system, and the structure of those computing devices can be seen in FIG. 12.
  • With the above design, the sending side of the compressed video can compress the original video in two different ways to obtain the compressed video, which effectively reduces the size of the compressed video while allowing it to carry relatively rich information about the original video.
  • After the receiving side of the compressed video receives the compressed video, it can decode the compressed video to obtain two different videos, and then use the two videos to restore a video whose resolution and frame rate differ little from those of the original video; the degree of video restoration is high, which can effectively improve video processing quality and efficiency.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, a data center, or the like that contains one or more sets of available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media.
  • the semiconductor medium may be a solid state drive (SSD).


Abstract

This application provides a video processing method, apparatus, and device. A compression apparatus can obtain a compressed video by applying a first compression method and a second compression method to an original video respectively. After obtaining the compressed video, a decompression apparatus can decode it to obtain a first video and a second video, where the first video includes the result of decoding the data segments of the compressed video obtained by the first compression method, and the second video includes the result of decoding the data segments obtained by the second compression method; a third video close to the original video is then generated from the first video and the second video. The compression apparatus compresses the original video in two different ways, thereby obtaining a compressed video that includes both high-resolution data and data with an unchanged frame rate; correspondingly, by decompressing the compressed video, the decompression apparatus can restore a third video, closer to the original video, with high resolution and an unchanged frame rate, thereby improving the quality of video transmission.

Description

A video processing method, apparatus, and device
This application claims priority to Chinese patent application No. 202011492788.4, entitled "A video processing method, apparatus, and device", filed with the China National Intellectual Property Administration on December 27, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of communication technologies, and in particular, to a video processing method, apparatus, and device.
Background
In recent years, with the widespread use of various video applications, video processing methods have gradually become a research focus. Video processing includes video coding. Video coding can exploit image characteristics to eliminate data redundancy in the images of a video, achieving video compression and reducing the space the video occupies; the size of the encoded video is significantly reduced, which reduces the transmission bandwidth required when the video is transmitted. Further, during video coding, the size of the video can also be reduced through bit-rate control. Region-of-interest-based bit-rate control is one of the more common approaches: based on the different sensitivity of the human eye to different regions of an image, different compression ratios are configured for different regions of the images in the video. For example, a smaller compression ratio is configured for regions to which the human eye is more sensitive (i.e., regions of interest), preserving the image of those regions as completely as possible, while a larger compression ratio is configured for regions to which the human eye is less sensitive (i.e., non-regions of interest), minimizing the space those regions occupy after encoding.
However, obtaining a compressed video by applying different compression ratios to different regions of the same image discards much of the information of the original video. As a result, when the video receiving end receives the compressed video and decodes and restores it, the restored video differs considerably from the original video, and the quality of the restored video is low. Therefore, how to provide a high-quality video processing method has become an urgent technical problem to be solved.
Summary
This application provides a video processing method, apparatus, and device, to provide a high-quality video processing method.
In a first aspect, an embodiment of this application provides a video processing method performed by a video processing system. The video processing system first obtains a compressed video to be processed, where the compressed video includes a plurality of data segments obtained by the video processing system by applying a first compression method and a second compression method to an original video respectively. After obtaining the compressed video, the video processing system may decode the compressed video and decompose it into two videos, a first video and a second video. The first video includes the result of decoding the data segments of the compressed video obtained by the first compression method; the second video includes the result of decoding the data segments obtained by the second compression method. After obtaining the first video and the second video, the video processing system may perform video restoration based on them to generate a third video close to the original video, where the third video has the same image sequence as the original video.
With the above method, the video processing system can compress the original video in two different ways to obtain a compressed video that includes high-resolution data and data with the same frame rate as the original video. During decompression, the compressed video can be decompressed to obtain a high-resolution video and a video with an unchanged frame rate respectively, and a third video closer to the original video can then be restored from these two videos, achieving a higher degree of video restoration.
In a possible implementation, before the video processing system obtains the compressed video to be processed, the video processing system may first compress the original video to obtain the compressed video. Specifically, the video processing system may compress the original video using the first compression method and the second compression method respectively to obtain the compressed video, where the first compression method is a compression method that down-samples the original video, and the second compression method is a compression method that extracts images from the original video at a preset time interval.
With the above method, the compressed video obtained by combining the two different compression methods can include a high-resolution video and a video with an unchanged frame rate, which facilitates the video processing system in restoring the third video from the compressed video.
In a possible implementation, when applying the first compression method and the second compression method to the original video to obtain the compressed video, the video processing system may use the two compression methods to obtain two data streams, each including a plurality of data segments, where the data stream obtained by compressing the original video with the first compression method is the first data stream, and the data stream obtained by compressing the original video with the second compression method is the second data stream. The video processing system may then obtain the compressed video from the first data stream and the second data stream.
With the above method, the video processing system can obtain two data streams using different compression methods and obtain the compressed video by mixing the two data streams, so that the compressed video carries data closer to the original video; the compressed video is also generated more conveniently, which can improve the efficiency of compressing the original video.
In a possible implementation, when compressing the original video using the first compression method to generate the first data stream, the video processing system may first down-sample the images in the original video to generate a fourth video, where the fourth video has the same frame rate as the original video, and encode the fourth video to generate the first data stream. Alternatively, after down-sampling the images in the original video to generate the fourth video, the video processing system may screen the images in the fourth video to obtain a fifth video, and then encode the fifth video to generate the first data stream.
With the above method, the video processing system can generate the first data stream by down-sampling, ensuring that video data with an unchanged frame rate is obtained.
In a possible implementation, when compressing the original video using the second compression method to generate the second data stream, the video processing system may extract images from the original video at a preset time interval and form a sixth video from the extracted images, where the sixth video has the same resolution as the original video; after the sixth video is obtained, it may be encoded to generate the second data stream.
With the above method, the video processing system can obtain the second data stream relatively quickly by extracting images and encoding them; this approach is efficient and can effectively improve video compression efficiency.
In a possible implementation, the compressed video further includes an identifier, and the identifier is used to indicate the data segments in the compressed video that belong to the first data stream or the data segments that belong to the second data stream.
With the above method, by carrying the identifier, the compressed video can directly indicate which of its data segments belong to the first data stream and which to the second data stream, making it convenient for the video processing system to decompose the compressed video using the identifier.
In a possible implementation, when decoding the compressed video, the video processing system may decompose the compressed video into the first data stream and the second data stream, and then decode the first data stream and the second data stream respectively to obtain the first video and the second video.
With the above method, the video processing system can decompose two different data streams from the compressed video and obtain two videos by simply decoding the two data streams.
In a possible implementation, when determining the third video from the first video and the second video, the video processing system may obtain a seventh video with a higher resolution from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video. The way the seventh video is obtained from the second video is not limited here; for example, multiple high-resolution frames may be obtained by performing single-image super-resolution reconstruction on multiple frames of the second video, and these high-resolution frames constitute the seventh video. After obtaining the seventh video, the video processing system may generate, from the first video and the seventh video, a third video relatively close to the original video, where the difference between the resolution of the third video and that of the original video is less than a first threshold, and the difference between the frame rate of the third video and that of the original video is less than a second threshold.
With the above method, the video processing system can obtain a third video with little difference from the original video, achieving a high degree of video restoration.
In a second aspect, an embodiment of this application provides a video processing method; for its beneficial effects, refer to the description of the first aspect, which is not repeated here. The method is performed by a decompression apparatus. In the method, the decompression apparatus first obtains a compressed video to be processed, where the compressed video includes a plurality of data segments obtained by applying a first compression method and a second compression method to an original video respectively. The decompression apparatus may decode the compressed video to obtain a first video and a second video, where the first video includes the result of decoding the data segments of the compressed video obtained by the first compression method, and the second video includes the result of decoding the data segments obtained by the second compression method. After obtaining the first video and the second video, the decompression apparatus may determine a third video from the first video and the second video, where the third video has the same image sequence as the original video.
In a possible implementation, when decoding the compressed video to obtain the first video and the second video, the decompression apparatus may decompose the compressed video into the first data stream and the second data stream, then decode the first data stream to generate the first video, and decode the second data stream to generate the second video.
In a possible implementation, when determining the third video from the first video and the second video, the decompression apparatus may obtain a seventh video from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video; it may then generate the third video from the first video and the seventh video, where the difference between the resolution of the seventh video and that of the original video is less than a first threshold, and the difference between the frame rate of the third video and that of the original video is less than a second threshold.
In a possible implementation, the compressed video further includes an identifier, and the identifier is used to indicate the data segments in the compressed video that belong to the first data stream or the data segments that belong to the second data stream.
In a possible implementation, when decomposing the compressed video into the first data stream and the second data stream, the decompression apparatus may decompose the compressed video into the first data stream and the second data stream according to the identifier in the compressed video.
In a third aspect, an embodiment of this application provides a video processing method; for its beneficial effects, refer to the description of the first aspect, which is not repeated here. The method is performed by a compression apparatus. In the method, the compression apparatus may compress an original video using a first compression method and a second compression method respectively to obtain a compressed video, where the first compression method is a compression method that down-samples the original video, and the second compression method is a compression method that extracts images from the original video at a preset time interval.
In a possible implementation, the compression apparatus compresses the original video using the first compression method to obtain a first data stream, where the first data stream includes at least one data segment; the compression apparatus compresses the original video using the second compression method to obtain a second data stream, where the second data stream includes at least one data segment; after obtaining the first data stream and the second data stream, the compression apparatus may obtain the compressed video from the first data stream and the second data stream.
In a possible implementation, when compressing the original video using the first compression method to generate the first data stream, the compression apparatus may down-sample the images in the original video to generate a fourth video; then screen the images in the fourth video to obtain a fifth video; and encode the fifth video to generate the first data stream.
In a possible implementation, when compressing the original video using the second compression method to generate the second data stream, the compression apparatus may extract images from the original video at a preset time interval to generate a sixth video, and encode the sixth video to generate the second data stream.
In a possible implementation, the compressed video further includes an identifier, and the identifier is used to indicate the data segments in the compressed video that belong to the first data stream or the data segments that belong to the second data stream.
In a fourth aspect, an embodiment of this application provides a video processing apparatus; for its beneficial effects, refer to the description of the first aspect, which is not repeated here. The video processing apparatus includes an obtaining unit, a decoding unit, and a restoration unit:
The obtaining unit is configured to obtain a compressed video to be processed, where the compressed video includes a plurality of data segments obtained by applying a first compression method and a second compression method to an original video respectively.
The decoding unit is configured to decode the compressed video to obtain a first video and a second video, where the first video includes the result of decoding the data segments of the compressed video obtained by the first compression method, and the second video includes the result of decoding the data segments obtained by the second compression method.
The restoration unit is configured to determine a third video from the first video and the second video, where the third video has the same image sequence as the original video.
In a possible implementation, the apparatus further includes a first compression unit, a second compression unit, and a mixing unit.
The first compression unit is configured to compress the original video using the first compression method, where the first compression method is a compression method that down-samples the original video.
The second compression unit is configured to compress the original video using the second compression method, where the second compression method is a compression method that extracts images from the original video at a preset time interval.
The mixing unit is configured to obtain the compressed video from the data produced by the first compression unit and the second compression unit compressing the original video.
In a possible implementation, the data obtained by the first compression unit compressing the original video with the first compression method is a first data stream, which includes at least one data segment; the data obtained by the second compression unit compressing the original video with the second compression method is a second data stream, which includes at least one data segment; the mixing unit may obtain the compressed video from the first data stream and the second data stream.
In a possible implementation, when compressing the original video using the first compression method to generate the first data stream, the first compression unit may first down-sample the images in the original video to generate a fourth video; then screen the images in the fourth video to obtain a fifth video; and then encode the fifth video to generate the first data stream.
In a possible implementation, when compressing the original video using the second compression method to generate the second data stream, the second compression unit may extract images from the original video at a preset time interval and form a sixth video from the extracted images; the second compression unit then encodes the sixth video to generate the second data stream.
In a possible implementation, the compressed video further includes an identifier, and the identifier is used to indicate the data segments in the compressed video that belong to the first data stream or the data segments that belong to the second data stream.
In a possible implementation, when decoding the compressed video to obtain the first video and the second video, the decoding unit may decompose the compressed video into the first data stream and the second data stream, and decode the first data stream and the second data stream respectively to obtain the first video and the second video.
In a possible implementation, when determining the third video from the first video and the second video, the restoration unit may obtain a seventh video from the second video, where the resolution of the images in the seventh video is higher than that of the images in the second video; it may then generate the third video from the first video and the seventh video, where the difference between the resolution of the seventh video and that of the original video is less than a first threshold, and the difference between the frame rate of the third video and that of the original video is less than a second threshold.
In a possible implementation, when decomposing the compressed video into the first data stream and the second data stream, the decoding unit may decompose the compressed video into the first data stream and the second data stream according to the identifier in the compressed video.
In a fifth aspect, this application provides a decompression apparatus that has the functions implemented by the decompression apparatus in the second aspect and any possible design of the second aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. In a possible design, the structure of the apparatus includes an obtaining unit, a decoding unit, and a restoration unit, and these units can perform the corresponding functions in the method example of the second aspect; for details, refer to the detailed description in the method example, which is not repeated here.
In a sixth aspect, this application provides a compression apparatus that has the functions implemented by the compression apparatus in the third aspect and any possible design of the third aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. In a possible design, the structure of the apparatus includes a first compression unit, a second compression unit, and a mixing unit, and these units can perform the corresponding functions in the method example of the third aspect; for details, refer to the detailed description in the method example, which is not repeated here.
In a seventh aspect, this application further provides a computing device; for beneficial effects, refer to the description of the first aspect and any possible implementation thereof, which is not repeated here. The structure of the computing device includes a processor and a memory, and the processor is configured to perform the corresponding functions of the decompression apparatus or the compression apparatus in the first aspect and any possible implementation thereof. The memory is coupled to the processor and stores the program instructions and data necessary for the decompression apparatus, the compression apparatus, or the video processing system. The structure of the device further includes a communication interface for communicating with other devices.
In an eighth aspect, this application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the method of the first aspect and any possible implementation thereof.
In a ninth aspect, this application further provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of the first aspect and any possible implementation thereof.
In a tenth aspect, this application further provides a computer chip connected to a memory, where the chip is configured to read and execute a software program stored in the memory to perform the method of the first aspect and any possible implementation thereof.
Brief Description of the Drawings
FIG. 1A to FIG. 1C are schematic architectural diagrams of a system according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a video transmission method according to an embodiment of this application;
FIG. 3A is a schematic flowchart of an image extraction method according to an embodiment of this application;
FIG. 3B is a schematic flowchart of a down-sampling method according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a method for mixing data streams according to an embodiment of this application;
FIG. 5 is a schematic flowchart of a method for generating a compressed video according to an embodiment of this application;
FIG. 6 is a schematic flowchart of a method for decomposing a compressed video according to an embodiment of this application;
FIG. 7 is a schematic diagram of a method for building a new video from video 3 and video 4 according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a reference-based super-resolution reconstruction module according to an embodiment of this application;
FIG. 9 is a schematic flowchart of a method for processing a compressed video according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a decompression apparatus according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of a compression apparatus according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of a computing device according to an embodiment of this application.
Detailed Description
如图1A所示,为本申请实施例适用的一种视频处理系统架构示意图,该系统中包括视频发送系统100和视频接收系统200。
视频发送系统100能够对原视频进行压缩,生成压缩视频,在本申请实施例中视频发送系统100在对原视频进行压缩时,可以对该原视频分别采用两种不同的压缩方式,一种压缩方式为对原视频下采样的压缩方式(为方便说明,该种压缩方式为第一压缩方式),另一种为按照预设的时间间隔从原视频中抽取图像的压缩方式(为方便说明,该种压缩方式为第二压缩方式)。其中,该时间间隔是以原视频的图像播放时间为基础确定的时间间隔,例如,该时间间隔的值可以等于在播放原视频时,原视频中两帧图像帧的播放时间的间隔,该时间间隔可以设定值,本申请实施例并不限定该时间间隔具体值。
如图1B所示,视频发送系统100中包括客户端110和视频编码系统120,客户端110和视频编码系统120通过网络相连,该网络包括局域网、互联网或无线网络等形式的连接。客户端110用于生成待编码的原视频,该客户端110可以部署在用户的终端、个人电脑、平板电脑等计算设备中。视频编码系统120包括一个或多个装置,用于实现对原视频的压缩,本申请实施例并不限定该视频编码系统120中的一个或多个装置的部署位置以及形态。以其中任一装置为例,该装置可以是一个硬件装置,例如:服务器、终端计算设备;也可以是一个软件装置,具体为运行在硬件计算设备上的一套软件系统;该装置还可以是虚拟机等虚拟化形式的设备。此外,本申请实施例中并不限定该装置所部署的位置。示例性的,该装置可以部署在云计算设备系统(包括至少一个云计算设备,例如:服务器等),也可以部署在边缘计算设备系统(包括至少一个边缘计算设备,例如:服务器、台式电脑等),也可以部署在各种终端计算设备上,例如:笔记本电脑、个人台式电脑、手机等。
当视频编码系统120中包括多个装置时,各个装置可以采用分布式计算方式执行视频编码处理,也可以每个装置各自执行一个视频编码处理。举例来说,该视频编码系统120可以包括四个装置,分别为视频获取装置、第一压缩装置、第二压缩装置以及视频发送装置。
视频获取装置可以用于获取客户端110生成的原视频,该客户端110可以具备视频拍摄功能的装置(如摄像器、数码相机、监控装置、手机、平板电脑),能够进行视频拍摄,拍摄后的视频为原视频。该客户端110也可以为安装有视频剪辑软件的装置,经过客户端进行视频剪辑后的视频可以作为原视频,客户端110可以将该原视频发送给视频获取装置。
视频获取装置获取原视频后,可以分别向第一压缩装置和第二压缩装置发送原视频。
第一压缩装置在接收到该原视频之后,采用第一压缩方式对原视频进行压缩。第二压缩装置在接收到该原视频之后,采用第二压缩方式对原视频进行压缩。
第一压缩方式为对原视频进行下采样的压缩方式,进一步的,第一压缩方式中还包括编码操作,该编码操作可以在对原视频进行下采样后执行,也即对原视频下采样后,对下采样后的视频进行编码实现对原视频的压缩。可选的,第一压缩方式中还包括筛选操作,该筛选操作可以在对原视频进行下采样之后,执行编码操作之前执行,也即先对原视频进行下采样,对下采样后获得的视频进行筛选,筛选之后,对筛选之后的视频进行编码操作。
第二压缩方式为从原视频抽取图像的压缩方式,进一步的,第一压缩方式中还包括编码操作,该编码操作可以在从原视频抽取图像后执行,也即从原视频抽取图像后,可以对抽取图像后形成的视频进行编码,实现对原视频的压缩。
第一压缩装置将压缩后生成的数据发送给视频发送装置,第二压缩装置将压缩后生成的数据发送给视频发送装置,视频发送装置根据接收到的数据生成压缩视频,并发送该压缩视频。
视频编码系统120中的多个装置可以部署在相同的系统或服务器中,如该多个装置可以分别部署在云计算设备系统、边缘计算设备系统或终端计算设备这三个环境中,也可以部署在这三个环境中的任意两个或一个中。
作为一种可能的实施例,视频编码系统120和客户端110也可以合一部署,也就是说视频编码系统120即用于生成原视频,又用于对原视频进行编码,并将压缩后的视频发送给视频接收系统200。
视频发送系统100在生成压缩视频后,可以向视频接收系统200发送该压缩视频,这里并不限定该视频发送系统100与视频接收系统200之间进行数据交互的方式,该视频发送系统100与视频接收系统200之间数据交互的形式与视频发送系统100与视频接收系统200之间的连接方式有关。例如,如果视频发送系统100与视频接收系统200之间可以通过光纤等有线线缆连接,则视频发送系统100可以通过该有线线缆向视频接收系统200发送压缩视频。又例如,如果视频发送系统100与视频接收系统200之间可以通过蓝牙、WIFI等无线链路连接,则视频发送系统100可以通过蓝牙、WIFI等无线链路向视频接收系统200发送压缩视频。
视频接收系统200在接收到该压缩视频后,可以对该压缩视频进行解压缩,将该压缩视频还原为与原视频接近或相同的第三视频。视频接收系统200对该压缩视频进行还原的过程,是视频发送系统100对原视频进行压缩的过程的“逆”过程。
视频接收系统200在接收到该压缩视频之后,解码该压缩视频获得第一视频和第二视频,之后,根据第一视频和第二视频获得第三视频。解码该压缩视频获得第一视频和第二视频与在视频发送系统100中第一数据流和第二数据流合并为压缩视频的方式相关。视频接收系统200在获得该第一视频和第二视频后,可以基于第一视频和第二视频获取原视频,或与原视频包括相同图像序列的第三视频。
这里第三视频与原视频包括相同图像序列是指下列的部分或全部:该第三视频中的帧数(图像的数量)与原视频中的帧数相同,该第三视频中的图像与原视频中的图像的相似度大于阈值相同,该第三视频与原视频中的图像的相似度大于阈值的图像数量大于设定值。 其中阈值与设定值的具体值可以为人为配置的,也可以为经验值。
如图1C所示,视频接收系统200中包括解压缩系统210和客户端220,解压缩系统210用于解压缩视频发送系统100发送的压缩视频,并将处理后的视频发送给客户端220,以便用户通过客户端220查看还原后的视频。可选地,解压缩系统210和客户端220的连接方式类似视频发送系统100中客户端220和解压缩系统210的连接和部署方式。
进一步地,解压缩系统210包括一个或多个装置,用于实现对压缩视频的解压缩,本申请实施例也不限定该解压缩系统210中一个或多个装置的部署位置以及形态。该解压缩系统210中装置的部署位置以及形态与该视频编码系统120中装置的部署位置以及形态相似,具体可参见前述内容,此处不再赘述。
当该解压缩系统210包括多个装置时,各个装置可以采用分布式计算方式执行视频解压缩处理,也可以每个装置各自执行一个视频解压缩处理。举例来说,该解压缩系统210可以包括三个装置,分别为视频分解装置、解码装置以及视频重建装置。
视频分解装置在接收到压缩视频后,可以从该压缩视频中分解出第一数据流和第二数据流,在获得第一数据流和第二数据流之后,视频分解装置可以将该第一数据流和第二数据流发送给解码装置,解码装置分别为第一数据流和第二数据流进行解码获得第一视频和第二视频。解码装置将第一视频和第二视频发送给视频重建装置。视频重建装置具备基于参考的超分辨率重建能力,可以利用第一视频和第二视频生成与原视频相近的第三视频。
解压缩系统210中的多个装置可以部署在相同的系统或服务器中,如该多个装置可以分别部署在云计算设备系统、边缘计算设备系统或终端计算设备这三个环境中,也可以部署在这三个环境中的任意两个或一个中。
在对本申请实施例提供的一种视频传输方法进行说明之前,对本申请实施例涉及的几个概念进行说明:
1、视频、视频的分辨率以及视频的帧率。
视频包括多帧图像,该多帧图像按照播放顺序排列,可以构成一个图像序列。
视频的分辨率是指视频中图像的分辨率,图像的分辨率可以指示该图像中像素点的数量。以分辨率为M*N的图像为例,是指该图像包括M*N个像素点,其中长度方向的像素点数目为M,宽度方向的像素点数目为N。
视频的帧率是指播放该视频时,单位时间内可展示的图像的帧数(一帧图像即为一张图像)。
在本申请实施例中视频1、视频2、视频3、视频4以及视频5均为视频。
2、视频编码(也可以简称为编码)、数据流。
对于视频中播放顺序靠近的两帧图像,这两帧图像中包括的相似信息较多,这种相似信息即为冗余信息,通过视频编码能够去除原视频中的冗余信息,将一个视频转换为另一个格式的文件。在本申请实施例中视频经过编码之后生成的另一种格式的文件称为数据流,该数据流中包括多个数据段。视频中的每帧图像经过视频编码会生成该数据流中的一个数据段(也即数据流中的数据段与视频中的图像是对应的,每帧图像经过视频编码会生成对应的数据段)。
将数据流复原成视频的过程即为视频解码,视频编码和视频解码的过程是互逆的,在视频解码的过程中,可以将数据流中的每个数据段复原为视频中的一帧图像。
在本申请实施例中数据流1、数据流2以及压缩视频均为数据流。
下面结合图2对本申请实施例提供的一种视频处理方法进行说明,参见图2,该方法包括:
步骤201:视频发送系统100获取原视频。
该原视频包括多帧图像,在该原视频中该多帧图像按照视频播放时间排序构成一个图像序列。
本申请并不限定视频发送系统100获取原视频的方式,例如该原视频可以是具备视频拍摄功能的客户端(如摄像器、数码相机、监控装置)拍摄的视频。该原视频可以是具备视频处理功能客户端(如安装有视频剪辑软件的计算设备)处理后的视频。
视频发送系统100在获取原视频后,可以采用两种压缩方式对该原视频进行处理,一种是在时间维度上对原视频进行的图像抽取(图像抽取后获得视频1)的压缩方式(参见步骤202~步骤203的说明),另一种是直接对原视频的下采样(下采样后获得视频2)的压缩方式(参见步骤204~步骤205的说明)。
步骤202:视频发送系统100按照预设的时间间隔从原视频抽取图像,生成视频1。
步骤202是在时间维度上进行的图像抽取,是减少原视频在单位时间内可播放的图像数量,也即降低原视频的帧率。
视频发送系统100可以间隔一定的时间间隔从原视频中抽取图像,将抽取的多帧图像组成视频1。例如,视频发送系统100可以每隔0.5秒抽取一帧图像。这里并不限定该时间间隔的具体值,该时间间隔可以是根据原视频帧率确定;例如,对于帧率较高的原视频,可以采用较小的时间间隔,对于帧率较低的原视频,可以采用较长的时间间隔。该时间间隔也可以是根据原视频所展示的内容确定;例如,对于各帧图像所展示的内容相似的原视频,可以采用较大的时间间隔,对于各帧图像所展示的内容差异较大的原视频,可以采用较小的时间间隔。
需要说明的是,该时间间隔是以原视频中图像的播放时间为基础确定的。通过时间间隔与帧率的乘积可以计算出该时间间隔内所能够显示的图像数量,这里以该数量为P为例,视频发送系统100间隔一定的时间间隔从原视频中抽取图像的过程,可以理解为每间隔P帧图像抽取一帧图像,抽取的多帧图像组成视频1。
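上述"每间隔P帧图像抽取一帧"的过程可以用如下Python代码草图示意(仅为说明性示例,frames、fps、interval_s等名称均为本示例的假设,并非专利原文定义):

```python
# 示意:按时间间隔从原视频(这里用帧列表表示)中抽取图像,生成视频1。

def extract_frames(frames, fps, interval_s):
    """每隔 interval_s 秒抽取一帧,等价于每间隔 P = interval_s * fps 帧抽取一帧。"""
    p = max(1, int(interval_s * fps))  # 该时间间隔内能够显示的帧数 P
    return frames[::p]                 # 每间隔 P 帧取一帧,组成视频1

# 用法示例:帧率 4 帧/秒、每 0.5 秒抽取一帧,即每隔 2 帧抽取一帧
video = list(range(10))                # 用帧序号代表原视频的 10 帧图像
video1 = extract_frames(video, fps=4, interval_s=0.5)
```

该草图保持了抽取帧的内容不变(对应正文中"保持分辨率、降低帧率")。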
如图3A所示,为对原视频在时间维度进行图像抽取的示意图,通过对原视频进行图像抽取,能够保持从原视频中抽取的图像的分辨率,减少原视频的帧率(单位时间内可播放的图像数量),获得了视频1。
从图3A所示中可以看出,视频1中图像的帧数低于原视频中的图像的帧数,视频1中每一帧图像是原视频中的一帧图像,也就是说,原视频中总是存在一帧图像与视频1中一帧图像相同。视频1中一帧图像与原视频中与该帧图像相同的一帧图像是对应的。
步骤203:视频发送系统100对视频1进行编码,获得数据流1。
视频发送系统100在获取了视频1后,可以对该视频1进行编码,这里并不限定对视频1进行编码的方式,例如可以基于业界通用标准(H264、或H265)对视频1进行编码。
视频发送系统100对视频1进行编码时,是对视频1中的每帧图像进行编码。视频发送系统100可以对视频1中的一帧图像进行编码,生成该数据流1中的一个数据段,视频1中每帧图像编码产生的数据段构成数据流1,视频1中每帧图像的排序方式,与该数据流1中每帧图像编码产生的数据段的排序方式一致。
综上,按照一定的时间间隔从原视频中抽取图像可以生成视频1,通过对视频1进行编码可以生成数据流1,也即数据流1是原视频经过图像抽取、编码后生成的。
步骤204:视频发送系统100对原视频进行下采样,生成视频2,该视频2包括多帧图像,该多帧图像是对原视频中的多帧图像降低了分辨率后的图像。
在步骤204中执行的下采样(subsampled)是对视频中每一帧图像的下采样,对一帧图像的下采样可以理解为按照预设倍率对该图像进行缩小,也即降低图像的分辨率。
对于分辨率为M*N的图像1,其中M是指图像1在长度方向上像素点的数量,N是指图像1在宽度方向上像素点的数量,对该图像1执行倍率为S的下采样,可以得到分辨率为(M/S)*(N/S)的图像2,也即图像2在长度方向上的像素点的数量为M/S,图像2在宽度方向上的像素点的数量为N/S。也就是说,在对该图像1执行倍率为S的下采样时,在长度方向上,将S个像素点缩小为1个像素点,在宽度方向上将S个像素点缩小为1个像素点,这样能够在长度方向和宽度方向均缩小S倍。对图像1执行倍率为S的下采样也就是将图像1中S*S的像素点矩阵变为一个像素点,可以得到图像2。
需要说明的是,这里仅是以长度方向和宽度方向缩小的倍数相同,均为S为例,在实际应用中,也可以在长度方向和宽度方向采用不同的缩小倍数,例如,在对图像1进行下采样的过程中,在长度方向缩小S1倍,也即在长度方向上将S1个像素点缩小为1个像素点,在宽度方向缩小S2倍,也即在宽度方向上将S2个像素点缩小为1个像素点。综合长度方向和宽度方向,长度方向和宽度方向倍数不同的下采样是将图像1中S1*S2的像素点矩阵变为一个像素点。
本申请并不限定下采样所采用的倍率的具体数值,该倍率可以为固定值,是视频发送系统100默认的值。该倍率也可以是根据该原视频所呈现的内容确定的值,例如,当该原视频的相邻图像帧的相似度较大时,可以采用较大的倍率;当该原视频的相邻图像帧的相似度较小时,可以采用较小的倍率。在实际应用中,对于原视频中不同的图像可以采用不同的倍率。举例来说,若原视频中排序靠前的多个图像帧相似度较大,超过一定阈值,可以采用较大的倍率,若原视频中排序靠后的多个图像帧由于相似度较小,可以采用较小的倍率。
本申请并不限定下采样的方式,包括但不限于:最大值采样(也即获取S*S的像素点矩阵中最大的像素值),平均值采样(也即获取S*S的像素点矩阵中像素值的平均值),求和区域采样(也即获取S*S的像素点矩阵中像素值的和值)或随机区域采样(也即随机获取S*S的像素点矩阵中一个像素点的像素值)、双线性插值法、双三次插值法、基于卷积神经网络的下采样。
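上述"将S*S的像素点矩阵变为一个像素点"的下采样及几种采样方式,可以用如下Python代码草图示意(灰度图用二维列表表示,downsample、mode等名称均为本示例的假设):

```python
# 示意:对图像执行倍率为 S 的下采样,将每个 S*S 的像素块变为一个像素。
# mode 对应正文中的最大值采样、求和区域采样、平均值采样。

def downsample(image, s, mode="mean"):
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % s, s):
        row = []
        for j in range(0, w - w % s, s):
            # 取出一个 S*S 的像素点矩阵
            block = [image[i + di][j + dj] for di in range(s) for dj in range(s)]
            if mode == "max":        # 最大值采样
                row.append(max(block))
            elif mode == "sum":      # 求和区域采样
                row.append(sum(block))
            else:                    # 平均值采样
                row.append(sum(block) / len(block))
        out.append(row)
    return out

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
small = downsample(img, 2, mode="max")   # 4*4 的图像下采样为 2*2
```

双线性插值、双三次插值及基于卷积神经网络的下采样在此不再展开。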
如图3B所示,为对原视频进行下采样的示意图,通过对原视频进行下采样,降低了原视频中的各帧图像的分辨率,保持原视频的帧率(单位时间内可播放的图像数量),获得了视频2。
从图3B所示中可以看出,视频2中图像的帧数与原视频中图像的帧数相同,也即帧率不变,视频2中的一帧图像是原视频中的一帧图像降低了分辨率后的图像。视频2中的一帧图像与原视频中、该帧图像降低分辨率之前的图像是对应的。
步骤205:视频发送系统100对视频2进行编码,生成数据流2。
视频发送系统100在获取了视频2后,可以对该视频2进行编码,这里并不限定对视频2进行编码的方式,例如可以基于业界通用标准(H264、或H265)对视频2进行编码。
在对视频2进行编码时,视频发送系统100可以直接对视频2进行编码,以获得数据流2。可选地,为了能够进一步减少视频发送系统100所发送的压缩视频的大小,视频发送系统100也可以对视频2中的图像进行筛选,对筛选了图像后的视频2进行编码,获得数据流2。
本申请实施例并不限定视频发送系统100对视频2的筛选方式,例如,视频发送系统100可以按照预设的时间间隔(该时间间隔为步骤202中涉及的时间间隔)从视频2中删除图像,也即视频发送系统100在该视频2中删除部分图像,该部分图像可以包括一帧或多帧图像,该部分图像中的每帧图像在原视频中对应的图像与视频1的一帧图像在原视频中对应的图像相同。
视频发送系统100对视频2进行编码时,是对视频2(或筛选了图像后的视频2)中的每帧图像进行编码。视频发送系统100可以对视频2(或筛选了图像后的视频2)中的一帧图像进行编码,生成该数据流2中的一个数据段,视频2(或筛选了图像后的视频2)每帧图像编码产生的数据段构成数据流2,视频2(或筛选了图像后的视频2)中每帧图像的排序方式,与该数据流2中每帧图像编码产生的数据段的排序方式一致。
综上,对原视频进行下采样可以生成视频2,通过对视频2(或筛选了图像后的视频2)进行编码可以生成数据流2,也即数据流2是原视频经过下采样(可选的,还可以经过图像筛选)、编码后生成的。
步骤206:视频发送系统100在获得数据流1和数据流2后,可以将数据流1和数据流2混合,生成压缩视频。
数据流1是对视频1编码后的数据流,也即数据流1中的一个数据段是视频1中一帧图像经过视频编码后生成的,该数据段与视频1中的该图像是对应的。又由于视频1中一帧图像与原视频中的一帧图像相同,也即视频1中的一帧图像与原视频中的一帧图像对应,则该数据流1中一个数据段可以理解为原视频中的一帧图像编码后生成的,也即该数据流1中的一个数据段与原视频的一帧图像存在对应关系。
类似的,数据流2是对视频2编码后的数据流,也即数据流2中的一个数据段是视频2中一帧图像经过视频编码后生成的,该数据段与视频2中的该图像是对应的。又由于视频2中一帧图像是原视频中一帧图像降低分辨率后的图像,也即视频2中的一帧图像与原视频中的一帧图像对应,则该数据流2中一个数据段可以理解为原视频中的一帧图像降低分辨率后经过编码生成的,也即该数据流2中的一个数据段与原视频的一帧图像存在对应关系。
数据流1和数据流2的混合是指将该数据流1中的多个数据段与数据流2中的多个数据段进行混合。
本申请实施例并不限定数据流1和数据流2的混合方式,例如,视频发送系统100可以将数据流1放置在数据流2之后,生成压缩视频。又例如,视频发送系统100也可以将数据流2放置在数据流1之后。
视频发送系统100也可以将数据流1中的数据段嵌入到数据流2中。如图4所示,视频发送系统100在将数据流1和数据流2混合时,对于数据流1中的每个数据段,可以将该数据段嵌入到数据流2的目标位置处,数据流1中该数据段对应的图像与数据流2中距离该目标位置最近的数据段对应的图像在原视频中的位置一致或相邻。也即在播放原视频时,数据流1中该数据段对应的图像与数据流2中距离该目标位置最近的数据段对应的图像的播放时间相同或相近,也即播放时间的差值处于预设范围内。
也就是说,数据流1中该数据段对应的图像与数据流2中距离目标位置最近的数据段对应的图像之间存在时间关联性。该时间关联性表现在原视频中该数据流1中该数据段对应的图像、以及数据流2中距离该目标位置最近的数据段对应的图像的播放时间相同或相近(相近可以理解为播放时间的差值处于预设范围内)。
本申请实施例中并不限定目标位置的确定方式,例如,由于数据流1是视频1经过编码后的数据流,视频1是原视频按照预设的时间间隔抽取图像后生成的,而原视频按照预设的时间间隔抽取图像的过程可以为每间隔P帧图像抽取图像的过程,根据视频1中图像与数据流1中数据段之间的对应关系、以及视频2中图像与数据流2中数据段之间的对应关系,可以在数据流2中每隔P个数据段确定一个目标位置,每确定一个目标位置,将数据流1中的一个数据段嵌入到该目标位置处。
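上述"在数据流2中每隔P个数据段确定一个目标位置并嵌入数据流1中的数据段"的混合过程,可以用如下Python代码草图示意(数据段用字符串代表,mix_streams等名称均为本示例的假设):

```python
# 示意:将数据流1的数据段按"每隔 P 个数据段一个目标位置"嵌入数据流2,生成混合后的数据段序列。

def mix_streams(stream1, stream2, p):
    mixed = []
    it1 = iter(stream1)
    for idx, seg in enumerate(stream2):
        if idx % p == 0:                  # 每隔 P 个数据段确定一个目标位置
            nxt = next(it1, None)
            if nxt is not None:
                mixed.append(nxt)         # 嵌入数据流1中对应的数据段
        mixed.append(seg)
    return mixed

s1 = ["A0", "A1"]                         # 数据流1(抽帧后编码,段数较少)
s2 = ["b0", "b1", "b2", "b3"]             # 数据流2(下采样后编码,段数较多)
mixed = mix_streams(s1, s2, p=2)
```

这样嵌入后,数据流1的每个数据段与数据流2中相邻的数据段在原视频中的播放时间保持对应。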
为了使得视频接收系统200能从压缩视频中区分出数据流2与数据流1,视频发送系统100还可以在压缩视频的数据中添加标识,该标识可以指示该压缩视频中属于数据流2的数据段,也可以指示该压缩视频中属于数据流1的数据段。
该标识指示该压缩视频中属于数据流2的数据段或指示该压缩视频中属于数据流1的数据段的方式有很多种,本申请实施例并不限定该标识的指示方式,也不限定该标识的具体形态。
以视频发送系统100将数据流2放置在数据流1之后生成压缩视频的方式为例,可以在数据流1的最后一个数据段后添加该标识,通过标识压缩视频中数据流2和数据流1的分界位置,区分该压缩视频中属于数据流2的数据段和属于数据流1的数据段。此时,压缩视频中位于该分界位置之前的数据段为数据流1中的数据段,位于该分界位置之后的数据段为数据流2中的数据段。
可选地,除了上述方案中通过标识压缩视频中两个数据流的分界位置区分两个数据流,该标识也可以为独立于压缩视频的信息,视频发送系统100可以在发送压缩视频时,向视频接收系统200发送该标识。
该标识还可以为添加到该压缩视频中一个数据段或多个数据段中,如位于一个数据段或多个数据段的头部或尾部,作为压缩视频中的组成部分。例如,视频发送系统100可以将该标识添加到位于分界位置处的数据段的头部,该数据段可以为压缩视频中属于数据流1的最后一个数据段,或压缩视频中属于数据流2的第一个数据段。
作为一种可能的实施例,本申请实施例的压缩视频中可以包括多个标识,一个标识位于压缩视频的一个数据段中,如位于该数据段内部,如位于该数据段的头部、尾部;该标识用于指示所在的数据段属于数据流2或数据流1。该标识可以是视频发送系统100在生成数据流2和数据流1时生成的。例如,视频发送系统100在生成数据流1时,每生成一个数据段,在该数据段中添加一个标识,该标识用于指示所在数据段属于数据流1。视频发送系统100也可以在生成数据流2时,每生成一个数据段,在该数据段中添加一个标识,该标识用于指示所在数据段属于数据流2。
在本申请实施例中,压缩视频中数据段中包括标识的方式有很多,例如,在压缩视频中每个数据段中可以包括一个标识,该标识用于指示所在的数据段属于数据流2或数据流1;又例如,在压缩视频中属于数据流2的每个数据段中可以包括一个标识,属于数据流1的每个数据段中可以不包括标识,该标识用于指示所在的数据段属于数据流2;又例如,在压缩视频中属于数据流1的每个数据段中可以包括一个标识,属于数据流2的每个数据段中可以不包括标识,该标识用于指示所在的数据段属于数据流1。
这里并不限定该标识指示所在的数据段属于数据流2或数据流1的方式,例如在该数据段的头部增加一个字段,该字段为第一值时指示该数据段属于数据流1,该字段为第二值时指示该数据段属于数据流2。
同样的,这里也不限定该标识指示所在的数据段与相邻数据段的分割位置的方式,例如在该数据段的头部再增加一个字段,该字段的值P可以指示所在数据段的字节数,表明该字段之后的P个字节的数据为该数据段的数据。
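上述两个头部字段(数据流标识字段与字节数字段)的打包与解析可以用如下Python代码草图示意(字段宽度、字节序等细节均为本示例的假设,并非专利限定):

```python
# 示意:数据段头部 = 1 字节数据流标识(第一值/第二值) + 4 字节数据段长度 P,
# 其后为 P 个字节的数据段数据。
import struct

def pack_segment(stream_id, payload):
    """按"标识字段 + 长度字段 + 数据"打包一个数据段。"""
    return struct.pack(">BI", stream_id, len(payload)) + payload

def unpack_segments(data):
    """按头部字段逐段解析,返回 (数据流标识, 数据段数据) 列表。"""
    segments, off = [], 0
    while off < len(data):
        stream_id, p = struct.unpack_from(">BI", data, off)
        off += 5                              # 跳过 1+4 字节头部
        segments.append((stream_id, data[off:off + p]))
        off += p                              # 该字段之后的 P 个字节为数据段数据
    return segments

buf = pack_segment(1, b"seg-a") + pack_segment(2, b"seg-b")
parsed = unpack_segments(buf)
```

接收侧据此既能区分数据段所属的数据流,又能确定相邻数据段的分割位置。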
下面结合具体示例进一步解释本申请的视频处理方法,如图5所示,对原视频生成压缩视频的方式进行说明,视频发送系统100在获取原视频后,可以通过下采样生成视频2,通过抽取图像生成视频1。视频发送系统100之后分别对视频2和视频1进行编码,生成数据流2和数据流1,之后根据数据流2和数据流1生成压缩视频。
步骤207:视频发送系统100向视频接收系统200发送压缩视频。
步骤208:视频接收系统200接收到压缩视频后,可以先对压缩视频进行分解,生成视频3以及视频4。
步骤208与步骤206为逆过程,视频接收系统200可以按照压缩视频的生成方式,对压缩视频进行分解,视频接收系统200可以先从压缩视频中分解出数据流1和数据流2。之后,再分别对数据流1和数据流2进行解码,生成视频3和视频4。
视频接收系统200从压缩视频中分解出数据流2和数据流1时,可以是根据标识将压缩视频分解为数据流2和数据流1。
标识的指示方式不同,视频接收系统200分解的方式也不同;例如,当该标识通过表征该压缩视频中数据流2和数据流1的分界位置指示该压缩视频中属于数据流2的数据段或指示该压缩视频中属于数据流1的数据段时,视频接收系统200根据该标识确定该压缩视频中数据流2和数据流1的分界位置,进而确定出属于数据流2的数据段和属于数据流1的数据段。视频接收系统200从压缩视频中截取属于数据流2的数据段,获得数据流2;视频接收系统200从压缩视频中截取属于数据流1的数据段,获得数据流1。
又例如,当压缩视频中的多个数据段中包括该标识,每个标识用于指示所在的数据段属于数据流1或数据流2时,视频接收系统200对该压缩视频中的每个数据段中的标识进行解析,进而确定属于数据流2的数据段以及属于数据流1的数据段,进而获得数据流2和数据流1。
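按每个数据段中的标识分解压缩视频的过程,可以用如下Python代码草图示意(数据段以"(标识, 数据)"二元组表示,标识取值1/2为本示例的假设):

```python
# 示意:视频接收系统按数据段中的标识,把压缩视频分解为数据流1和数据流2。

def split_streams(segments):
    stream1 = [seg for sid, seg in segments if sid == 1]   # 属于数据流1的数据段
    stream2 = [seg for sid, seg in segments if sid == 2]   # 属于数据流2的数据段
    return stream1, stream2

mixed = [(1, "A0"), (2, "b0"), (2, "b1"), (1, "A1"), (2, "b2")]
s1, s2 = split_streams(mixed)
```

分解得到的两个数据流再分别解码,即可获得视频3和视频4。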
视频接收系统200在获取了数据流2和数据流1后,可以分别对该数据流2和数据流1进行解码,这里并不限定对数据流2和数据流1进行解码的方式,对数据流2和数据流1进行解码的方式与视频发送系统100对视频2和视频1进行编码的方式对应,例如可以基于业界通用标准(H264、或H265)对数据流2和数据流1进行解码。
如图6所示,为视频接收系统200对压缩视频分解获取视频3和视频4示意图,首先,视频接收系统200可以将压缩视频中的数据段进行分解,获取数据流2和数据流1,之后,分别对数据流2和数据流1进行解码,获得视频4和视频3。
需要说明的是,视频1是数据流1编码前的视频,视频3是数据流1解码后的视频,理论上,视频1和视频3相同;但考虑到数据编码以及解码过程中可能存在一些信息丢失,视频1和视频3可能存在较小的差异。类似的,视频2是数据流2编码前的视频,视频4是数据流2解码后的视频,理论上,视频2和视频4相同;但考虑到数据编码以及解码过程中可能存在一些信息丢失,视频2和视频4可能存在较小的差异。
综上,视频3是数据流1解码后的视频,视频3和视频1相似或相同,也即该视频3可以看做是通过对原视频抽取图像后获得的视频,视频3保持了原视频的分辨率。视频4是数据流2解码后的视频,视频4和视频2相似或相同,也即该视频4可以看做是原视频经过下采样(可选的,还包括图像筛选)后获得的视频,视频4保持了原视频的帧率。
步骤209:视频接收系统200根据视频3和视频4,生成视频5。
由于视频3是由按照预设的时间间隔从原视频抽取图像后获得的视频,获取视频3的过程仅降低了原视频的帧率,并未改变原视频中每帧图像的分辨率;而视频4则是在原视频基础上分别对每帧图像进行下采样,获取的视频4的过程降低了原视频中每帧图像的分辨率,保证了帧率;所以,视频接收系统200还原原视频的过程可以以视频3为基础,结合视频4进行视频还原,具体的,可以先将视频4还原成分辨率较高的候选视频,该候选视频中包括多个候选图像,之后,根据视频3和候选视频获得分辨率和帧率均接近原视频的视频5。
将视频4中的图像分为两类,第一类为视频3中所包括的图像降低了分辨率的图像,第二类为原视频中除了视频3所包括的图像外的图像降低了分辨率的图像。
具体还原过程包括两部分,一部分是针对第一类图像的还原,另一部分是为针对第二类图像的还原。
针对第一类图像的还原,可以直接将视频3中的图像作为该类图像提升了分辨率后的图像,也即还原后的图像。
针对第二类图像的还原,针对该类图像中的任一帧图像,可以从视频3中选择一帧图像作为参考图像,先对视频4中的该帧图像进行单幅图像超分辨率(single image super-resolution,SISR)重建,生成一帧候选图像,再对参考图像以及该帧候选图像进行分析,确定参考图像和该帧候选图像中存在差异的图像特征以及相同的图像特征,保留相同的图像特征,对于存在差异的图像特征进行融合。通过保留相同的图像特征以及融合存在差异的图像特征获得视频5中的一帧图像。
其中,单幅图像超分辨率重建是指基于图像的分析,将低分辨率的图像转换为更高分辨率图像的技术。SISR包括但不限于超分辨卷积神经网络(super-resolution convolutional neural network,SRCNN)、深度卷积网络(very deep convolutional networks,VDSR)、单图像超分辨率增强深残差网络(enhanced deep residual networks for single image super-resolution,EDSR)。
存在差异的图像特征的融合可以理解为以特征的权重保留参考图像和该帧候选图像中的图像特征,例如,可以为参考图像和候选图像配置对应的权重,通过权重与图像特征的乘积的和值实现图像特征的融合。
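上述"通过权重与图像特征的乘积的和值实现融合、相同特征直接保留"的思路,可以用如下Python代码草图示意(特征用数值列表代表,权重取值与fuse_features等名称均为本示例的假设):

```python
# 示意:对参考图像与候选图像的图像特征逐项比较,
# 相同的特征直接保留,存在差异的特征按权重加权求和融合。

def fuse_features(ref_feat, cand_feat, w_ref=0.6, w_cand=0.4, eps=1e-6):
    fused = []
    for r, c in zip(ref_feat, cand_feat):
        if abs(r - c) < eps:                  # 相同的图像特征:直接保留
            fused.append(r)
        else:                                 # 存在差异的图像特征:权重乘积求和
            fused.append(w_ref * r + w_cand * c)
    return fused

out = fuse_features([1.0, 2.0, 4.0], [1.0, 3.0, 4.0])
```

实际系统中该融合通常由神经网络的特征融合模块学习得到,这里仅给出加权求和的最简形式。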
本申请实施例并不限定参考图像的确定方式,例如该参考图像可以是视频3中与视频4中该帧图像相似度大于阈值的图像。又例如,该参考图像与视频4中该帧图像在原视频中的位置相邻或一致。又例如,将视频3中与视频4中的该帧图像存在时间关联性的图像作为参考图像,也即该参考图像与视频4中该帧图像在原视频中对应的图像的播放时间相近,播放时间的间隔处于预设范围内。
针对第一类图像和第二类图像的还原也可以借助已训练好的模块,视频接收系统200可以预先部署有基于参考的超分辨率重建模块,该模块具备基于参考的超分辨率重建(reference-based super-resolution,refSR)的功能,能够以高分辨率的视频3为参考,对低分辨率的视频4进行重建,最终输出高分辨率的视频5。具体到视频中的图像,基于参考的超分辨率重建模块可以将视频3中的一帧图像作为参考图像,对视频4中的一帧图像进行重建,生成视频5中的一帧图像。
从视频3和视频4的生成方式中可以看出,视频3中图像为原视频中的原始图像,视频4中的图像为原视频中的图像降低分辨率后的图像,该基于参考的超分辨率重建模块可以利用视频3中的图像为参考图像,对视频4中的图像进行重建。
视频接收系统200可以为视频4中的每一帧图像从视频3中确定一帧图像作为参考图像,这里并不限定视频接收系统200为视频4的每一帧图像确定参考图像的方式,具体可参见前述说明,此处不再赘述。
由于视频4中图像的帧数大于视频3中的图像的帧数,在实际应用中,允许视频4中多帧图像对应的参考图像相同。
如图7所示,视频接收系统200为视频4的每一帧图像从视频3中选择一帧图像作为参考图像之后,可以将该参考图像放在视频4中的该帧图像之后,生成一个新的视频。
该新的视频中每帧低分辨率的图像(该图像为属于视频4中的图像)之后有一帧高分辨率的图像(该图像为属于视频3中的图像),该新的视频可以作为基于参考的超分辨率重建模块的输入。
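为视频4的每帧低分辨率图像配上参考图像并交错组成新视频的过程,可以用如下Python代码草图示意(按"原视频中位置最近"选取参考图像,该选取规则与build_ref_video等名称均为本示例的假设):

```python
# 示意:视频3中第 k 帧对应原视频中第 k*p 帧;
# 为视频4中第 i 帧选择原视频位置最近的视频3帧作为参考图像,
# 并按"低分辨率帧、参考帧"交错排列,组成新的视频。

def build_ref_video(video4, video3, p):
    new_video = []
    for i, lr in enumerate(video4):
        ref = video3[min(round(i / p), len(video3) - 1)]
        new_video.extend([lr, ref])          # 参考图像放在该帧之后
    return new_video

v3 = ["HR0", "HR2"]                          # 视频3:每隔 2 帧抽取的高分辨率帧
v4 = ["lr0", "lr1", "lr2", "lr3"]            # 视频4:帧率与原视频相同的低分辨率帧
nv = build_ref_video(v4, v3, p=2)
```

由于视频4的帧数大于视频3的帧数,多帧低分辨率图像可以对应同一参考图像。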
基于参考的超分辨率重建模块的结构如图8所示,首先,基于参考的超分辨率重建模块中的单图超分辨重建模块对该新的视频中的低分辨率图像进行单幅图像超分辨率重建获得候选图像,再通过特征选择模块从候选图像和低分辨率图像的参考图像中确定这两个图像之间存在差异的图像特征以及相同的图像特征。特征融合模块对存在差异的图像特征进行融合,并保留这两个图像之间相同的图像特征,进而获得视频5中的图像。
本申请实施例并不限定基于参考的超分辨率重建模块具体形态,例如该基于参考的超分辨率重建模块可以为神经网络模型,通过预先的训练学习能够实现基于参考的超分辨率重建。又例如,该基于参考的超分辨率重建模块也可以根据基于参考的超分辨率重建背后的信号处理机制实现的refSR计算库。
基于前述说明,下面对视频接收系统200对压缩视频的处理方式进行综合描述,如图9所示,为本申请实施例中视频接收系统200对压缩视频的处理方式。在图9中,视频接收系统200在接收到压缩视频后,分解出数据流1和数据流2,之后对数据流1和数据流2分别进行视频解码获得视频3和视频4。在获得视频3和视频4之后,可以对视频3和视频4进行整合形成新的视频,该新的视频输入到基于参考的超分辨率重建模块之后,输出得到视频5。
通过上述还原过程,得到的视频5既能够保证视频5中图像的分辨率与原视频保持一致,也能够保证视频5的帧率与原视频的帧率一致或存在较小差异,使得视频5更接近于原视频。
在本申请实施例中,视频发送系统100能够对原视频采用两种不同的方式压缩,使得生成的压缩视频中能够携带与原视频相关的较多的信息。对应的,当视频接收系统200接收到该压缩视频后,对该压缩视频进行解码获得两种不同的视频,之后再利用生成的两种不同视频还原出与原视频分辨率以及帧率差异较小的视频。这种视频处理方法在保证压缩视频较小的情况下,还能够保证在视频接收系统200侧还原出接近于原视频的视频,视频还原程度较高,方式更加简单高效。
应理解的是,在本申请实施例中以视频发送系统100以及视频接收系统200进行交互为例进行说明,在实际应用中,视频发送系统100以及视频接收系统200单侧执行的方法也可以独立存在,也即一个处理装置可以采用图2所示的实施例中视频发送系统100所执行的方法通过对原视频压缩获得压缩视频,保存该压缩视频,以减少保存原视频所需要的空间,也即通过保存该压缩视频来保存该原视频。当需要获得该原视频的情况下,该处理装置可以采用图2所示的实施例中视频接收系统200所执行的方法通过对压缩视频解压缩获得与原视频接近的视频5,方法简单高效,且视频还原程度高。
基于与方法实施例同一发明构思,本申请实施例还提供了一种解压缩装置1000,该解压缩装置1000用于执行上述方法实施例中视频接收系统执行的方法。如图10所示,解压缩装置1000包括获取单元1001、解码单元1002以及还原单元1003。具体地,在解压缩装置1000中,各单元之间通过通信通路建立连接。
获取单元1001,用于获取待处理的压缩视频,压缩视频包括对原视频分别采用第一压缩方式和第二压缩方式获得的多个数据段。
解码单元1002,用于解码压缩视频获得第一视频和第二视频,第一视频包括解码压缩视频中采用第一压缩方式获得数据段的结果;第二视频包括解码压缩视频中采用第二压缩方式获得数据段的结果。
还原单元1003,用于根据第一视频和第二视频确定第三视频,第三视频与原视频的图像序列相同。
应理解的是,本申请实施例的解压缩装置1000中的各个单元可以通过专用集成电路(application-specific integrated circuit,ASIC)实现,或可编程逻辑器件(programmable logic device,PLD)实现,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合。当通过软件实现图2所示的视频处理方法时,解压缩装置1000以及解压缩装置1000中的各个单元可以为软件模块。
作为一种可能的实施方式,解码单元1002在解码压缩视频获得第一视频和第二视频时,可以将压缩视频分解为第一数据流和第二数据流;之后,对第一数据流进行解码,生成第一视频;对第二数据流进行解码,生成第二视频。
作为一种可能的实施方式,还原单元1003在根据第一视频和第二视频确定第三视频时,可以根据第二视频获得第七视频,第七视频中图像的分辨率高于第二视频的图像的分辨率;之后,再根据第一视频和第七视频生成第三视频,第七视频的分辨率和原视频的分辨率的差异小于第一阈值,第三视频的帧率和原视频的帧率差异小于第二阈值。
作为一种可能的实施方式,压缩视频中还包括标识,标识用于指示压缩视频中属于第一数据流的数据段或属于第二数据流的数据段。
作为一种可能的实施方式,解码单元1002在将压缩视频分解为第一数据流和第二数据流时,可以根据压缩视频中的标识将压缩视频分解为第一数据流和第二数据流。
本申请实施例所提供的解压缩装置1000可对应于执行本申请如图2所述的实施例中视频接收系统所执行的方法,并且解压缩装置1000中各个单元的上述和其它操作和/或功能分别为了实现图2中步骤208~步骤209中方法的相应流程,为了简洁,具体可参见前述方法实施例中的说明,在此不再赘述。
在本申请实施例中,当解压缩装置1000在接收到该压缩视频后,解码该压缩视频,生成两种不同的视频,之后基于该两种不同视频还原出与原视频分辨率以及帧率差异较小的视频,解压缩装置1000能够解析该压缩视频,获得的两种不同的视频在不同程度上携带了原视频的相关信息,使得最终可以还原出与原视频较为接近的视频,视频还原质量显著提升。
基于与方法实施例同一发明构思,本申请实施例还提供了一种压缩装置1100,该压缩装置1100用于执行上述方法实施例中视频发送系统执行的方法。如图11所示,压缩装置1100包括第一压缩单元1101、第二压缩单元1102以及混合单元1103。具体地,在压缩装置1100中,各单元之间通过通信通路建立连接。
第一压缩单元1101,用于采用第一压缩方式对原视频进行压缩,第一压缩方式为对原视频下采样的压缩方式。
第二压缩单元1102,用于采用第二压缩方式对原视频进行压缩,第二压缩方式为按照预设的时间间隔从原视频抽取图像的压缩方式。
混合单元1103,用于根据第一压缩单元1101和第二压缩单元1102对原视频压缩后的数据,获取压缩视频。
应理解的是,本申请实施例的压缩装置1100中的各个单元可以通过专用集成电路(application-specific integrated circuit,ASIC)实现,或可编程逻辑器件(programmable logic device,PLD)实现,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合。当通过软件实现图2所示的视频处理方法时,压缩装置1100以及压缩装置1100中的各个单元可以为软件模块。
作为一种可能的实施方式,第一压缩单元1101可以采用第一压缩方式压缩原视频,生成第一数据流,第一数据流包括至少一个数据段。第二压缩单元1102可以采用第二压缩方式压缩原视频,生成第二数据流,第二数据流包括至少一个数据段。混合单元1103可以根据第一数据流和第二数据流获得压缩视频。
作为一种可能的实施方式,第一压缩单元1101在采用第一压缩方式压缩原视频,生成第一数据流时,可以对原视频中的图像进行下采样,生成第四视频;对第四视频中的图像进行筛选,获得第五视频;对第五视频进行编码,生成第一数据流。
作为一种可能的实施方式,第二压缩单元1102在采用第二压缩方式压缩原视频,生成第二数据流时,可以按照预设的时间间隔从原视频中抽取图像,生成第六视频;对第六视频进行编码,生成第二数据流。
作为一种可能的实施方式,压缩视频中还包括标识,标识用于指示压缩视频中属于第一数据流的数据段或属于第二数据流的数据段。
本申请实施例提供的压缩装置1100可对应于执行本申请如图2所述的实施例中视频发送系统所执行的方法,并且压缩装置1100中各个单元的上述和其它操作和/或功能分别为了实现图2中步骤201~步骤207中方法的相应流程,为了简洁,具体可参见前述方法实施例中的说明,在此不再赘述。
在本申请实施例中,压缩装置能够对原视频采用两种不同的方式压缩,使得生成的压缩视频既包括高分辨率的数据流,又包括与原视频的帧率相同的数据流,使得解压装置可以根据高分辨率的数据流和帧率不变的数据流分别还原原视频。便于压缩视频的接收端能够基于此还原出与原视频较为接近的视频。压缩装置所采用的压缩方式也相对简单。
基于与方法实施例同一发明构思,本申请实施例还提供了一种视频处理装置,该视频处理装置用于执行上述方法实施例中视频接收系统以及视频发送系统执行的方法。该视频处理装置可以包括前述说明中解压缩装置中的各个单元以及压缩装置中的各个单元,解压缩装置中的各个单元以及压缩装置中的各个单元的功能可以参见前述内容,此处不再赘述。
本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,另外,在本申请各个实施例中的各功能模块可以集成在一个处理器中,也可以是单独物理存在,也可以两个或两个以上模块集成为一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
该集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台终端设备(可以是个人计算机,手机,或者网络设备等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
本申请还提供如图12所示的计算设备1200。所述计算设备1200包括总线1201、处理器1202、通信接口1203和存储器1204。处理器1202、存储器1204和通信接口1203之间通过总线1201通信。
其中,处理器1202可以为中央处理器(central processing unit,CPU)。存储器1204可以包括易失性存储器(volatile memory),例如随机存取存储器(random access memory,RAM)。存储器1204还可以包括非易失性存储器(non-volatile memory),例如只读存储器(read-only memory,ROM)、快闪存储器、HDD或SSD。存储器中存储有可执行代码,处理器1202执行该可执行代码以执行前述图2所描述的方法。存储器1204中还可以包括操作系统等其他运行进程所需的软件模块(如解压缩装置1000中的多个单元或压缩装置1100中的多个单元)。操作系统可以为LINUX™、UNIX™、WINDOWS™等。图12中,仅示例性的绘制出了存储器1204中包括解压缩装置1000的多个单元。
当存储器1204中包括解压缩装置1000中的多个单元时,处理器1202可以调用存储器1204中的软件模块执行上述方法实施例中视频接收系统200所执行的方法。
当存储器1204中包括压缩装置1100中的多个单元时,处理器1202可以调用存储器1204中的软件模块执行上述方法实施例中视频发送系统100所执行的方法。
当存储器1204中包括解压缩装置1000和压缩装置1100中的多个单元时,处理器1202可以调用存储器1204中的软件模块执行上述方法实施例中视频接收系统200以及视频发送系统100所执行的方法。
作为一种可能的实施例,本申请还提供一种计算设备系统,所述计算设备系统包括至少两个如图12所示的计算设备1200。
任意两个计算设备1200之间通过通信网络通信,其中,一个计算设备上运行解压缩装置1000,另一个计算设备上运行压缩装置1100,两个计算设备分别用于执行上述方法实施例中视频发送系统100或视频接收系统200执行的方法中相应主体的操作步骤。
上述各个附图对应的流程的描述各有侧重,某个流程中没有详述的部分,可以参见其他流程的相关描述。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。计算机程序产品包括计算机程序指令,在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本发明实施例图2所述的流程或功能。
本申请还提供一种计算设备系统,该计算设备系统包括两个或两个以上虚拟机、容器等虚拟化形式的计算设备,每个虚拟机或容器分别用于实现上述方法实施例中视频发送系统100或视频接收系统200执行的方法中相应主体的操作步骤。其中,虚拟机或容器运行在计算设备系统的计算设备中,计算设备的结构可以参见图12。
在本申请实施例中,压缩视频的发送侧可以采用两种不同的方式对原视频进行压缩,以获得该压缩视频,可以有效减少该压缩视频的大小,同时也使得该压缩视频中可以携带较多该原视频的数据,这样,当压缩视频的接收侧接收到该压缩视频后,能够解码该压缩视频获得两种不同的视频,再利用该两种不同的视频还原出与原视频分辨率以及帧率差异较小的视频,视频还原程度高,能够有效提升视频处理效率。
上述实施例,可以全部或部分地通过软件、硬件、固件或其他任意组合来实现。当使用软件实现时,上述实施例可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载或执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以为通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集合的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质。半导体介质可以是固态硬盘(solid state drive,SSD)。
显然,本领域的技术人员可以对本申请进行各种改动和变型而不脱离本申请范围。这样,倘若本申请的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。

Claims (20)

  1. 一种视频处理的方法,其特征在于,所述方法包括:
    获取待处理的压缩视频,所述压缩视频包括对原视频分别采用第一压缩方式和第二压缩方式获得的多个数据段;
    解码所述压缩视频获得第一视频和第二视频,所述第一视频包括解码所述压缩视频中采用所述第一压缩方式获得数据段的结果;所述第二视频包括解码所述压缩视频中采用所述第二压缩方式获得数据段的结果;
    根据所述第一视频和所述第二视频确定第三视频,所述第三视频与所述原视频的图像序列相同。
  2. 根据权利要求1所述的方法,其特征在于,在所述获取待处理的压缩视频之前,所述方法还包括:
    分别采用所述第一压缩方式和所述第二压缩方式对所述原视频进行压缩,获得所述压缩视频,所述第一压缩方式为对所述原视频下采样的压缩方式;所述第二压缩方式为按照预设的时间间隔从所述原视频抽取图像的压缩方式。
  3. 根据权利要求2所述的方法,其特征在于,所述对所述原视频分别采用所述第一压缩方式和所述第二压缩方式获得所述压缩视频,包括:
    采用所述第一压缩方式压缩所述原视频,生成第一数据流,所述第一数据流包括至少一个数据段;
    采用所述第二压缩方式压缩所述原视频,生成第二数据流,所述第二数据流包括至少一个数据段;
    根据所述第一数据流和所述第二数据流获得所述压缩视频。
  4. 根据权利要求3所述的方法,其特征在于,采用所述第一压缩方式压缩所述原视频,生成第一数据流,包括:
    对所述原视频中的图像进行下采样,生成第四视频;
    对所述第四视频中的图像进行筛选,获得第五视频;
    对所述第五视频进行编码,生成所述第一数据流。
  5. 根据权利要求3所述的方法,其特征在于,采用所述第二压缩方式压缩所述原视频,生成第二数据流,包括:
    按照预设的时间间隔从所述原视频中抽取图像,生成第六视频;
    对所述第六视频进行编码,生成所述第二数据流。
  6. 根据权利要求3或4所述的方法,其特征在于,所述压缩视频中还包括标识,所述标识用于指示所述压缩视频中属于所述第一数据流的数据段或属于所述第二数据流的数据段。
  7. 根据权利要求1所述的方法,其特征在于,所述解码所述压缩视频获得第一视频和第二视频,包括:
    将所述压缩视频分解为第一数据流和第二数据流;
    对所述第一数据流进行解码,生成所述第一视频;
    对所述第二数据流进行解码,生成所述第二视频。
  8. 根据权利要求1所述的方法,其特征在于,所述根据所述第一视频和所述第二视频确定第三视频,包括:
    根据所述第二视频获得第七视频,所述第七视频中图像的分辨率高于所述第二视频的图像的分辨率;
    根据所述第一视频和所述第七视频生成所述第三视频,所述第三视频的分辨率和所述原视频的分辨率的差异小于第一阈值,所述第三视频的帧率和所述原视频的帧率差异小于第二阈值。
  9. 根据权利要求6所述的方法,其特征在于,所述将所述压缩视频分解为第一数据流和所述第二数据流,包括:
    根据所述压缩视频中的标识将所述压缩视频分解为第一数据流和所述第二数据流。
  10. 一种视频处理系统,其特征在于,所述系统包括获取单元、解码单元以及还原单元:
    所述获取单元,用于获取待处理的压缩视频,所述压缩视频包括对原视频分别采用第一压缩方式和第二压缩方式获得的多个数据段;
    所述解码单元,用于解码所述压缩视频获得第一视频和第二视频,所述第一视频包括解码所述压缩视频中采用所述第一压缩方式获得数据段的结果;所述第二视频包括解码所述压缩视频中采用所述第二压缩方式获得数据段的结果;
    所述还原单元,用于根据所述第一视频和所述第二视频确定第三视频,所述第三视频与所述原视频的图像序列相同。
  11. 根据权利要求10所述的系统,其特征在于,所述系统还包括第一压缩单元、第二压缩单元以及混合单元;
    所述第一压缩单元,用于采用第一压缩方式对原视频进行压缩,所述第一压缩方式为对所述原视频下采样的压缩方式;
    所述第二压缩单元,用于采用第二压缩方式对所述原视频进行压缩,所述第二压缩方式为按照预设的时间间隔从所述原视频抽取图像的压缩方式;
    所述混合单元,用于根据所述第一压缩单元和所述第二压缩单元对所述原视频进行压缩后的数据,获取压缩视频。
  12. 根据权利要求11所述的系统,其特征在于,所述第一压缩单元采用所述第一压缩方式压缩所述原视频获得的数据为第一数据流,所述第一数据流包括至少一个数据段;所述第二压缩单元采用所述第二压缩方式压缩所述原视频获得的数据为第二数据流,所述第二数据流包括至少一个数据段;
    所述混合单元,具体用于:根据所述第一数据流和所述第二数据流获得所述压缩视频。
  13. 根据权利要求12所述的系统,其特征在于,所述第一压缩单元在采用所述第一压缩方式压缩所述原视频,生成第一数据流时,具体用于:
    对所述原视频中的图像进行下采样,生成第四视频;
    对所述第四视频中的图像进行筛选,获得第五视频;
    对所述第五视频进行编码,生成所述第一数据流。
  14. 根据权利要求12所述的系统,其特征在于,所述第二压缩单元在采用所述第二压缩方式压缩所述原视频,生成第二数据流时,具体用于:
    按照预设的时间间隔从所述原视频中抽取图像,生成第六视频;
    对所述第六视频进行编码,生成所述第二数据流。
  15. 根据权利要求12或13所述的系统,其特征在于,所述压缩视频中还包括标识,所述标识用于指示所述压缩视频中属于所述第一数据流的数据段或属于所述第二数据流的数据段。
  16. 根据权利要求10所述的系统,其特征在于,所述解码单元在解码所述压缩视频获得第一视频和第二视频时,具体用于:
    将所述压缩视频分解为第一数据流和所述第二数据流;
    对所述第一数据流进行解码,生成所述第一视频;
    对所述第二数据流进行解码,生成所述第二视频。
  17. 根据权利要求10所述的系统,其特征在于,所述还原单元在根据所述第一视频和所述第二视频确定第三视频,具体用于:
    根据所述第二视频获得第七视频,所述第七视频中图像的分辨率高于所述第二视频的图像的分辨率;
    根据所述第一视频和所述第七视频生成所述第三视频,所述第七视频的分辨率和所述原视频的分辨率的差异小于第一阈值,所述第三视频的帧率和所述原视频的帧率差异小于第二阈值。
  18. 根据权利要求15所述的系统,其特征在于,所述解码单元在将所述压缩视频分解为第一数据流和所述第二数据流,具体用于:
    根据所述压缩视频中的标识将所述压缩视频分解为第一数据流和所述第二数据流。
  19. 一种计算设备系统,其特征在于,所述计算设备系统包括处理器和存储器;
    所述存储器,用于存储计算机程序指令;
    所述处理器执行调用所述存储器中的计算机程序指令执行如权利要求1至9中任一项所述的方法。
  20. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机可执行指令,所述计算机可执行指令用于使计算机执行如权利要求1至9任一项所述的方法。
PCT/CN2021/133650 2020-12-17 2021-11-26 一种视频处理方法、装置以及设备 WO2022127565A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011492788.4 2020-12-17
CN202011492788.4A CN114650426A (zh) 2020-12-17 2020-12-17 一种视频处理方法、装置以及设备

Publications (1)

Publication Number Publication Date
WO2022127565A1 true WO2022127565A1 (zh) 2022-06-23

Family

ID=81989897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/133650 WO2022127565A1 (zh) 2020-12-17 2021-11-26 一种视频处理方法、装置以及设备

Country Status (2)

Country Link
CN (1) CN114650426A (zh)
WO (1) WO2022127565A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116996617B (zh) * 2023-07-31 2024-09-13 咪咕音乐有限公司 一种视频彩铃的展示方法、装置、电子设备及介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1219255A (zh) * 1996-01-30 1999-06-09 德莫格拉夫克斯公司 先进电视中时域和分辨率的分层方法
WO2005032138A1 (en) * 2003-09-29 2005-04-07 Koninklijke Philips Electronics, N.V. System and method for combining advanced data partitioning and fine granularity scalability for efficient spatio-temporal-snr scalability video coding and streaming
US20050219642A1 (en) * 2004-03-30 2005-10-06 Masahiko Yachida Imaging system, image data stream creation apparatus, image generation apparatus, image data stream generation apparatus, and image data stream generation system
CN101507278A (zh) * 2006-08-16 2009-08-12 微软公司 用于数字视频的可变分辨率编码和解码的技术
CN103379351A (zh) * 2012-04-28 2013-10-30 中国移动通信集团山东有限公司 一种视频处理方法及装置
CN110213626A (zh) * 2018-02-28 2019-09-06 Tcl集团股份有限公司 视频处理方法及终端设备

Also Published As

Publication number Publication date
CN114650426A (zh) 2022-06-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21905494

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21905494

Country of ref document: EP

Kind code of ref document: A1