WO2020181540A1 - Video processing method and device, encoding apparatus, and decoding apparatus - Google Patents


Info

Publication number
WO2020181540A1
WO2020181540A1 PCT/CN2019/078050 CN2019078050W WO2020181540A1 WO 2020181540 A1 WO2020181540 A1 WO 2020181540A1 CN 2019078050 W CN2019078050 W CN 2019078050W WO 2020181540 A1 WO2020181540 A1 WO 2020181540A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
pixel accuracy
processed
target pixel
type
Prior art date
Application number
PCT/CN2019/078050
Other languages
French (fr)
Chinese (zh)
Inventor
孟学苇
郑萧桢
王苫社
马思伟
Original Assignee
北京大学
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学, 深圳市大疆创新科技有限公司
Priority to PCT/CN2019/078050 (WO2020181540A1)
Priority to CN201980005058.6A (CN111567044A)
Publication of WO2020181540A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation

Definitions

  • The present invention relates to the field of communication technology, and in particular to a video processing method, device, encoding device and decoding device.
  • When a terminal device stores or transmits a video, the video content needs to be encoded first, and the encoded video is then stored or transmitted.
  • The encoded video is decoded by the decoding method corresponding to the encoding process and then displayed.
  • In the process of encoding video, one of the key technologies is inter-frame prediction.
  • the main idea of inter-frame prediction is to obtain the predicted frame from the motion vector of the current frame and the reference frame in the video.
  • the selection of the pixel accuracy of the motion vector is directly related to the quality of inter-frame prediction, which in turn affects the quality of video coding. Therefore, in the field of digital video coding technology, how to choose the pixel accuracy during coding processing has become a hot issue in current research.
  • the embodiments of the present invention provide a video processing method, device, encoding device and decoding device, which can improve the encoding performance of terminal equipment.
  • an embodiment of the present invention provides a video processing method, including:
  • if the video type of the video to be processed is a preset video type, modifying each pixel precision value in the initial pixel precision set to obtain a target pixel precision set;
  • embodiments of the present invention provide another video processing method, including:
  • if the coded video includes identification information, determining that the video type corresponding to the coded video is a preset video type;
  • the target pixel accuracy set is obtained by modifying each pixel accuracy value in the initial pixel accuracy set.
  • an embodiment of the present invention provides a video processing device, including a determining unit and a processing unit:
  • the determining unit is used to determine the video type of the acquired video to be processed
  • a processing unit configured to, if the determining unit determines that the video type of the video to be processed is a preset video type, modify each pixel precision value in the initial pixel precision set to obtain a target pixel precision set;
  • the processing unit is further configured to perform encoding processing on the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
  • an embodiment of the present invention also provides another video processing device, including a receiving unit and a processing unit:
  • a processing unit configured to determine that the video type corresponding to the coded video is a preset video type when the coded video includes identification information
  • the processing unit is further configured to decode the encoded video based on the target pixel accuracy set;
  • the target pixel accuracy set is obtained by modifying each pixel accuracy value in the initial pixel accuracy set.
  • An embodiment of the present invention provides an encoding device comprising a memory and a processor. The memory is connected to the processor and is used to store a computer program that includes program instructions, and the processor is configured to call the program instructions to execute the video processing method of the first aspect described above.
  • An embodiment of the present invention provides a decoding device comprising a memory and a processor. The memory is connected to the processor and is used to store a computer program that includes program instructions, and the processor is configured to call the program instructions to execute the video processing method of the second aspect described above.
  • An embodiment of the present invention also provides a computer storage medium in which a first computer program instruction is stored; when executed by a processor, the first computer program instruction is used to execute the video processing method of the first aspect. The computer storage medium also stores a second computer program instruction; when executed by the processor, the second computer program instruction is used to execute the video processing method of the second aspect.
  • In the embodiments of the present invention, the terminal device determines the video type of the acquired video to be processed. If the video type is a preset video type, it increases each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set, and then encodes the video to be processed based on the target pixel accuracy set to obtain an encoded video.
  • Because the target pixel accuracy set used in encoding is determined according to the video type of the video to be processed, a targeted pixel accuracy set is selected for to-be-processed videos of different video types, which can improve the quality of the encoded video.
  • FIG. 1 is a scene diagram of drone aerial photography provided by an embodiment of the present invention.
  • FIG. 2a is a schematic diagram of a motion estimation provided by an embodiment of the present invention.
  • FIG. 2b is a schematic diagram of determining a motion vector provided by an embodiment of the present invention.
  • FIG. 3a is a schematic diagram of another motion estimation provided by an embodiment of the present invention.
  • FIG. 3b is a schematic diagram of yet another motion estimation provided by an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a video processing method provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an encoding system provided by an embodiment of the present invention.
  • FIG. 6 is an interaction diagram provided by an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an encoding device provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a decoding device provided by an embodiment of the present invention.
  • The embodiment of the present invention proposes a video processing method that addresses the problem of pixel precision selection in video encoding.
  • The method sets the pixel precision set used for encoding according to the video type of the video to be processed, which can improve video encoding performance.
  • The video processing method provided by the embodiment of the present invention may include: determining the video type of the acquired to-be-processed video; if the video type of the to-be-processed video is a preset video type, modifying each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set; and encoding the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
  • In this way, the initial pixel accuracy set is modified according to the video type of the video to be processed to obtain a pixel accuracy set suited to that video.
  • A targeted pixel precision set is thus selected for to-be-processed videos of different video types, which can improve the quality of the encoded video.
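  • The selection step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the type labels, the default set and the "multiply by four" rule are all assumptions chosen for the example.

```python
from fractions import Fraction

# Hypothetical default set: integer, 1/2 and 1/4 pixel accuracy.
INITIAL_SET = (Fraction(1), Fraction(1, 2), Fraction(1, 4))

def select_accuracy_set(video_type, preset_type, initial_set, modify):
    """Return the accuracy set the encoder should use for this video."""
    if video_type == preset_type:
        # Preset type: every value in the initial set is modified.
        return tuple(modify(value) for value in initial_set)
    # Otherwise the initial set is used unchanged.
    return tuple(initial_set)

# Illustrative rule only: make every accuracy value four times larger.
target = select_accuracy_set("screen_content", "screen_content",
                             INITIAL_SET, lambda v: v * 4)
print(target)
```

  • Passing the modification rule as a callable keeps the sketch independent of any particular rule, since the text leaves the concrete rule open.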
  • FIG. 1 is a scene diagram of drone aerial photography provided by an embodiment of the present invention. FIG. 1 includes a drone 101, a camera area 102, and a display device 103.
  • A camera device 1011, which can be used to shoot videos and images, is mounted on the drone 101.
  • The drone 101 can also be equipped with a pan/tilt 1012, in which case the camera device 1011 is mounted on the drone 101 through the pan/tilt 1012.
  • The camera area 102 includes vehicles, trees, and rivers.
  • The camera device 1011 shoots the camera area 102 to obtain a video to be processed.
  • An initial pixel accuracy set can be set in the drone 101 by default. The pixel accuracies in the initial pixel accuracy set may not be applicable to all video types. Therefore, after obtaining the video to be processed, the drone 101 does not directly use the initial pixel accuracy set to encode it. Instead, it determines the video type of the video to be processed and then determines whether that video type is suitable for encoding with the initial pixel accuracy set.
  • If the video type is suitable, the to-be-processed video is encoded based on the initial pixel accuracy set. If the video type is not suitable, each pixel accuracy value in the initial pixel accuracy set is modified so that each modified value is applicable to the video type of the video to be processed, and the modified values constitute the target pixel accuracy set.
  • In that case, the drone 101 encodes the video to be processed based on the target pixel accuracy set.
  • The drone 101 sends the encoded video obtained by encoding the video to be processed to the decoding end.
  • The decoding end may be configured in the drone 101 or may be a decoding device independent of the drone 101.
  • The decoding end uses the corresponding decoding strategy to decode the encoded video and finally sends the decoded video to the display device 103.
  • The display device 103 can be a device with a display screen. After receiving the decoded video sent by the decoding end, it can show the decoded video on the display screen so that the user can watch it.
  • Video refers to the various technologies that capture, record, process, store, transmit, and reproduce a series of static images in the form of electrical signals.
  • The original video captured by a camera, video camera or other shooting device contains a lot of redundant information, so the amount of uncompressed video data is very large; it is difficult to store and inconvenient to transmit over a network.
  • For example, the data volume of one second of digital TV video is about 1113 KB, that is, a bit rate of 9,123,840 bits per second. If the transmission bandwidth is 1 Mbit/s, it takes about 9 seconds to transmit one second of digital TV video. In other words, a user who wants to watch one second of digital TV video has to wait up to 9 seconds, which greatly degrades the user experience.
  • As another example, the data volume of an uncompressed 10-second video is about 2.4 GB. A mobile phone with 16 GB of storage has at most 12 GB left after the part occupied by the system is excluded, so it can store at most 50 seconds of such video.
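  • The two figures above can be checked with a few lines of arithmetic. The unit readings (1M taken as 1,000,000 bits per second, KB taken as 1024 bytes) are assumptions made for this check, not statements from the source.

```python
# Figures quoted in the text; the unit interpretations are assumptions.
bits_per_second_of_video = 9_123_840      # bit rate of the example video
bandwidth_bits_per_second = 1_000_000     # "1M" read as 1 Mbit/s

size_kib = bits_per_second_of_video / 8 / 1024
transfer_seconds = bits_per_second_of_video / bandwidth_bits_per_second

raw_gb_per_10_seconds = 2.4               # uncompressed 10-second clip
free_storage_gb = 12                      # storage left on the phone
storable_seconds = free_storage_gb / raw_gb_per_10_seconds * 10

print(f"{size_kib:.2f} KiB per second of video")
print(f"{transfer_seconds:.2f} s to transmit one second of video")
print(f"{storable_seconds:.0f} s of uncompressed video fit in storage")
```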
  • Compressing the original video means removing the large amount of redundant information it contains, such as temporal redundancy, visual redundancy, and spatial redundancy.
  • The process of compressing the original video is essentially the process of video encoding.
  • In the embodiments of the present invention, the video to be processed is the original video.
  • One of the key technologies in the video encoding process is inter-frame prediction.
  • Inter-frame prediction exploits the temporal correlation between adjacent frames of the video: it uses a previously coded and reconstructed frame as the reference frame and predicts the current frame through motion estimation and motion compensation, thereby removing temporal redundancy from the video.
  • The theoretical basis of inter-frame prediction is that there is a certain correlation between the scenes in adjacent frames of a moving image. When encoding, it is not necessary to transmit all the information of each frame; transmitting only the difference between frames is sufficient.
  • A video can be regarded as composed of multiple frames of images.
  • Encoding the video means encoding each frame of image included in the video.
  • To encode a frame, the frame image is first divided into multiple coding regions, each coding region is divided into multiple coding units, and each coding unit includes multiple coding blocks. Inter-frame prediction is performed on each coding block in turn.
  • For a target coding block, the relative position between the block and its similar block in the reference frame is called the motion vector (MV) (for convenience of description, the process of determining the motion vector is called motion estimation below). According to the motion vector, the related information of the motion vector and the reference frame, the prediction block corresponding to the target coding block is obtained.
  • a similar process can obtain the prediction block corresponding to each coding block of the current frame, so that the prediction frame of the current frame can be obtained.
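  • As a rough illustration of motion estimation and motion compensation at integer-pixel accuracy, the sketch below uses an exhaustive block search with the sum of absolute differences (SAD) as the matching criterion. The search strategy, the cost measure and all function names are assumptions for illustration; the source does not prescribe them.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
        for r in range(size) for c in range(size)
    )

def estimate_motion(current, reference, x, y, size, search_range):
    """Full search at integer-pixel accuracy; returns the best (dx, dy)."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(current, x, y, reference, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv

def predict_block(reference, x, y, size, mv):
    """Motion compensation: copy the block the motion vector points to."""
    dx, dy = mv
    return [row[x + dx:x + dx + size] for row in reference[y + dy:y + dy + size]]
```

  • Repeating `estimate_motion` and `predict_block` for every coding block of the current frame yields the prediction frame described above.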
  • the above-mentioned related information of the motion vector includes the pixel accuracy (also can be understood as the pixel accuracy of the motion vector) used in the motion estimation process, the motion vector difference (MVD), etc.
  • MVD refers to the difference between a motion vector obtained through a motion estimation process and a motion vector prediction (MVP).
  • The MVP is calculated from the MVs of multiple coded blocks adjacent to the current coding block. The larger the pixel accuracy value, the coarser the pixel accuracy and the lower the accuracy of motion estimation; the smaller the pixel accuracy value, the finer the pixel accuracy and the higher the accuracy of motion estimation.
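  • The relationship between MV, MVP and MVD can be written out directly. The sketch below assumes motion vectors are (x, y) pairs measured in pixels; the reconstruction helper reflects general video coding practice rather than anything stated in this document, and the sample values are made up.

```python
from fractions import Fraction

def motion_vector_difference(mv, mvp):
    """MVD is the component-wise difference between the MV and its prediction."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

def reconstruct_mv(mvp, mvd):
    """Adding the MVD back to the MVP recovers the original MV."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

mv = (Fraction(5, 4), Fraction(-3, 2))   # hypothetical MV, in pixels
mvp = (Fraction(1), Fraction(-1))        # hypothetical prediction
mvd = motion_vector_difference(mv, mvp)
print(mvd)
```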
  • Figures 3a and 3b are schematic diagrams of two kinds of motion estimation provided by the embodiments of the present invention.
  • In the figures, the black dots represent whole pixels and the white dots represent 1/2 pixels.
  • 301 represents a target coding block in the current frame, and 302 represents a similar block in the reference frame.
  • The arrow 303 represents the motion vector corresponding to the target coding block, that is, the position difference of the target coding block between the previous frame image and the current frame image.
  • The pixel precision used in Figure 3b is 1/2 pixel precision, and 304 represents the motion vector corresponding to the target coding block.
  • Comparing Figure 3a with Figure 3b shows that the motion vector represented by 304 is more accurate than the motion vector represented by 303.
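  • The effect of pixel accuracy on motion vector accuracy can be illustrated numerically. The sketch below rounds a hypothetical true displacement to the nearest multiple of each accuracy value; the displacement of 7/5 pixel and the helper name are assumptions for the example.

```python
from fractions import Fraction

def snap_to_accuracy(displacement, accuracy):
    """Round a displacement (in pixels) to the nearest multiple of `accuracy`."""
    steps = round(Fraction(displacement) / accuracy)
    return steps * accuracy

true_motion = Fraction(7, 5)   # hypothetical true displacement: 1.4 pixels
for accuracy in (Fraction(1), Fraction(1, 2)):
    mv = snap_to_accuracy(true_motion, accuracy)
    # Print the accuracy, the representable MV and the remaining error.
    print(accuracy, mv, abs(true_motion - mv))
```

  • With these inputs the half-pixel motion vector lies closer to the true displacement than the integer-pixel one, which mirrors the comparison between 304 and 303.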
  • the encoding device can use Adaptive Motion Vector Resolution (AMVR) technology to determine the pixel accuracy of the motion vector used in the inter-frame prediction process.
  • the main principle of the AMVR technology is: the encoding device can set a set of pixel precisions, and the set of pixel precisions can include at least two pixel precisions.
  • That is, the encoding device sets a pixel accuracy set and, when encoding a video, selects an appropriate pixel accuracy from the set for each coding unit. This removes visual redundancy in the video while also reducing the amount of data processed by the encoding device and saving some terminal power consumption.
  • the pixel accuracy set may be (integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy), or the pixel accuracy set may also be (integer pixel accuracy, 4 pixel accuracy, and 1/4 pixel accuracy).
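  • The per-coding-unit choice made by AMVR can be sketched as picking the set member with the lowest cost. The cost function here is a toy assumption (rounding error plus a signalling penalty that grows as accuracy gets finer); a real encoder would use its own rate-distortion measure, which the source does not describe.

```python
from fractions import Fraction

def choose_accuracy(accuracy_set, cost_of):
    """Pick the accuracy with the lowest cost for one coding unit."""
    return min(accuracy_set, key=cost_of)

def example_cost(true_motion, weight=Fraction(1, 20)):
    """Build a toy cost: rounding error plus a penalty for finer accuracy."""
    def cost(accuracy):
        error = abs(true_motion - round(true_motion / accuracy) * accuracy)
        signalling = weight / accuracy   # finer accuracy -> larger MVD values
        return error + signalling
    return cost

accuracy_set = (Fraction(1), Fraction(1, 2), Fraction(1, 4))
static_unit = choose_accuracy(accuracy_set, example_cost(Fraction(2)))
moving_unit = choose_accuracy(accuracy_set, example_cost(Fraction(5, 4)))
print(static_unit, moving_unit)
```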
  • video types can include natural video and screen content video.
  • Natural video refers to a video obtained by shooting a scene with a camera device without other processing; screen content video generally refers to video content displayed on the screen of an encoding device, mainly content shown on computer screens, TV screens, mobile phone screens and the like.
  • This type of video includes not only some natural images, but also some visual content generated by computers such as text, graphics, animation, and games. It is a kind of video formed by a mixture of natural and artificial images. Compared with natural videos, screen content videos often have steep edges, high-purity colors, strong contrasts, etc., as well as more regular and simple motion information.
  • For convenience of description, this application refers to the pixel precision set that the encoding device sets as described above (for example, the integer pixel precision, 1/2 pixel precision and 1/4 pixel precision mentioned above) as the initial pixel precision set.
  • The initial pixel precision set may not suit every video type. When a screen content video is encoded, each pixel precision value in the initial pixel precision set is modified to obtain a target pixel accuracy set suitable for screen content video.
  • Alternatively, the initial pixel accuracy set set by the encoding device may be suitable for screen content videos. In that case, if the video to be processed is a natural video, each pixel accuracy value in the initial pixel accuracy set needs to be modified to obtain a target pixel accuracy set suitable for natural video.
  • FIG. 4 is a schematic flowchart of a video processing method provided by an embodiment of the present invention.
  • The video processing method can be used in any encoding device capable of implementing an encoding function, and can be specifically executed by a processor of the encoding device.
  • The video processing method may include the following steps:
  • Step S401: The encoding device determines the video type of the acquired video to be processed.
  • The to-be-processed video may be obtained by a camera device configured on the encoding device photographing the subject, or it may be captured by an independent camera device photographing the subject and then sent to the encoding device.
  • Different camera devices may produce different video formats, so videos may come in multiple formats, such as avi, mp4, mts, and mp3.
  • The video format of the to-be-processed video acquired by the encoding device may be any one of the foregoing video formats.
  • Videos can be classified into natural video and screen content video according to the way the video content is generated.
  • Natural video can refer to a video directly shot by a camera or a video camera; that is, a natural video consists of multiple frames of natural images, for example, everyday short videos shot with mobile phones. Screen content video generally refers to content displayed on the screen of an encoding device.
  • The encoding devices mentioned here may mainly include devices such as computers, televisions, and mobile phones.
  • The screen content video includes not only some natural images, but also some computer-generated visual content such as text, graphics, animation, or games.
  • The screen content video is thus a video formed by mixing natural images and artificial images, for example, a movie, or an animation added to a presentation on a computer.
  • The video type of the to-be-processed video acquired by the encoding device may be either natural video or screen content video.
  • Videos can also be classified into long videos and short videos based on the duration of the video content.
  • When the encoding device encodes videos of different video types, setting a corresponding pixel precision set for each video type in a targeted manner can improve the quality of the encoded video while also saving some terminal power consumption. Therefore, in the embodiment of the present invention, after the video type of the video to be processed is determined, a suitable pixel precision set is selected for the video to be processed, and the video is then encoded based on the selected pixel precision set. In an embodiment, the encoding device may select a suitable pixel accuracy set for the video to be processed through step S402.
  • The to-be-processed video is composed of multiple frames of images.
  • When the video is processed, each of these frames is processed in turn. Therefore, determining the video type of the to-be-processed video, as described below, is essentially determining the video type of the frame of the to-be-processed video that is currently being processed.
  • In one embodiment, step S401 may include: determining a hash value corresponding to the video to be processed; if the hash value is not greater than a threshold, determining that the video type of the video to be processed is the preset video type; if the hash value is greater than the threshold, determining that the video type of the video to be processed is not the preset video type. That is, the encoding device can determine the video type to which the video to be processed belongs according to the hash value corresponding to the video to be processed.
  • The threshold is a preset value used to determine the video type, and the value can be set by the encoding device.
  • The preset video type may be set by the encoding device, and may include any one or more of screen content video or natural video. In other embodiments, the preset video type may also include any one of long video or short video.
  • After the encoding device obtains the video to be processed and before it starts to encode a target frame of that video, it first calculates the hash value of the target frame. If the hash value of the target frame is less than or equal to the threshold, the target frame is determined to be of the preset video type; if the hash value of the target frame is greater than the threshold, the target frame is determined not to be of the preset video type. The pixel accuracy set required for encoding is then selected according to the result of this judgment.
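  • The threshold test on the hash value can be sketched as below. How the hash of a frame is computed is not specified in this text, so the sketch takes the hash value as an input; the function names and sample values are assumptions.

```python
def is_preset_video_type(frame_hash, threshold):
    """A frame is of the preset type when its hash does not exceed the threshold."""
    return frame_hash <= threshold

def select_set_for_frame(frame_hash, threshold, initial_set, target_set):
    """Choose the accuracy set for one frame from the result of the hash test."""
    if is_preset_video_type(frame_hash, threshold):
        return target_set
    return initial_set

# Hypothetical values for illustration only.
chosen = select_set_for_frame(frame_hash=12, threshold=20,
                              initial_set=("1", "1/2", "1/4"),
                              target_set=("4", "1", "1/4"))
print(chosen)
```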
  • In another embodiment, step S401 may further include: calling a video type recognition model to recognize the to-be-processed video and obtain a recognition result; if the video type indicated by the recognition result is the preset video type, determining that the video type of the to-be-processed video is the preset video type. That is to say, the encoding device may store a video type recognition model, which is obtained by training on video samples of different video types. The encoding device calls the model to recognize the video to be processed and obtains the recognition result.
  • The recognition result may include the probability that the video to be processed belongs to each video type, and the video type with the higher probability is determined as the video type of the video to be processed. For example, suppose the encoding device calls the video type recognition model on the video to be processed and the recognition result is 30% natural video and 70% screen content video. According to this result, the video to be processed is determined to be screen content video.
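  • Turning a recognition result into a video type amounts to taking the most probable label. The sketch below assumes the model output is a mapping from type label to probability; the label strings are hypothetical.

```python
def video_type_from_result(recognition_result):
    """Return the video type with the highest probability."""
    return max(recognition_result, key=recognition_result.get)

# The 30% / 70% split is the example given in the text.
result = {"natural": 0.30, "screen_content": 0.70}
print(video_type_from_result(result))
```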
  • Step S402: If the video type of the to-be-processed video is the preset video type, increase each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set.
  • The initial pixel accuracy set may be the default pixel accuracy set of the encoding device, or may be the pixel accuracy set used by the encoding device when encoding the previous frame.
  • The encoding device may set the initial pixel accuracy set according to the pixel accuracy sets used in historical video encoding. For example, the encoding device obtains the pixel accuracy sets used in its last 5 video encoding processes; if 4 of them used the same set (1/2 pixel accuracy, 1/4 pixel accuracy, integer pixel accuracy), then (1/2 pixel accuracy, 1/4 pixel accuracy, integer pixel accuracy) is determined to be the default initial pixel accuracy set.
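  • Choosing the default set from encoding history can be sketched as a most-frequent-value lookup. Treating "the set used most often" as the selection criterion is an assumption drawn from the 4-out-of-5 example; the second set in the history is made up.

```python
from collections import Counter
from fractions import Fraction

def default_initial_set(history):
    """Return the accuracy set used most often in the given encoding history."""
    most_common_set, _count = Counter(history).most_common(1)[0]
    return most_common_set

usual = (Fraction(1, 2), Fraction(1, 4), Fraction(1))   # set from the text
other = (Fraction(1), Fraction(4), Fraction(1, 4))      # hypothetical set
history = [usual, usual, other, usual, usual]           # last 5 encodings
print(default_initial_set(history))
```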
  • The initial pixel accuracy set may also be set by the encoding device according to an acquired setting operation.
  • Before the encoding device performs encoding, the user can set the initial pixel accuracy set through the user interface of the encoding device, and can also perform other coding-related configuration operations through the user interface.
  • the initial pixel accuracy set currently set in the encoding device is acquired, if the initial pixel The accuracy of each pixel included in the accuracy set meets the pixel accuracy requirement when encoding the video of the preset video type, then the initial pixel accuracy set can be used directly to encode the image to be processed; if the initial pixel accuracy Each pixel accuracy included in the set has one or more pixel accuracy, which does not meet the pixel accuracy requirements when encoding the video of the preset video type, then the corresponding pixel accuracy value is modified, and finally the modified The initial pixel accuracy set of is used as the target pixel accuracy set for encoding the to-be-processed video.
  • the modification of the pixel accuracy value may include increase or decrease.
  • The method of modifying each pixel precision value in the initial pixel precision set to obtain the target pixel precision set may be: determining a pixel precision value adjustment rule according to the preset video type; and modifying each pixel precision value included in the initial pixel precision set according to the pixel precision value adjustment rule to obtain the target pixel precision set.
  • the implementation manner of determining the pixel accuracy value adjustment rule according to the preset video type may be to determine the pixel accuracy value adjustment rule according to the motion rule of the preset video type and the video content of the video to be processed.
  • the pixel accuracy value adjustment rule is: the difference between the modified pixel accuracy value and the corresponding pixel accuracy value before the modification is less than or equal to 7 pixels.
  • the adjustment rule can also be set as: the difference between the modified pixel accuracy and the corresponding pixel accuracy value before the modification is less than or equal to 1/2 pixel accuracy, etc. It should be understood that the foregoing is only a method for modifying the pixel accuracy value listed in the embodiment of the present invention, and the specific modification method is not specifically limited in the embodiment of the present invention.
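  • A bound of this kind on the difference between a modified value and its original value can be checked with a few lines of code. The sketch below assumes that pixel accuracy values are expressed in pixel units as `Fraction` objects and that the sets are compared position by position; the function name `satisfies_adjustment_rule` is hypothetical.

```python
from fractions import Fraction as F


def satisfies_adjustment_rule(initial_set, modified_set, max_difference):
    """Check that every modified value stays within `max_difference` of its original."""
    return all(
        abs(modified - initial) <= max_difference
        for initial, modified in zip(initial_set, modified_set)
    )


initial = (F(1, 4), F(1, 8), F(1, 16))
modified = (F(1, 2), F(1), F(2))

within_7_pixels = satisfies_adjustment_rule(initial, modified, F(7))
within_half_pixel = satisfies_adjustment_rule(initial, modified, F(1, 2))
```

  • With these illustrative values the largest change is 2 - 1/16 = 31/16 pixels, so the 7-pixel bound holds while the 1/2-pixel bound does not.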
  • the coding mode includes the first type of coding mode and the second type of coding mode.
  • the first type of coding mode may refer to either one of the inter coding mode and the affine coding mode.
  • the second type of coding mode refers to the other of the inter coding mode and the affine coding mode.
  • the initial pixel accuracy set may include a first initial pixel accuracy set and a second initial pixel accuracy set.
  • the initial pixel accuracy set corresponding to the first type of encoding mode may be the first initial pixel accuracy set, and the corresponding initial pixel accuracy set in the second type of encoding mode may be the second initial pixel accuracy set.
  • the initial pixel accuracy set corresponding to the first type of encoding mode may also be the second initial pixel accuracy set, and the initial pixel accuracy set corresponding to the second type of encoding mode may also be the first initial pixel accuracy set.
  • The main difference between the inter coding mode and the affine coding mode is that the inter coding mode only considers translational motion information in the video, whereas the affine coding mode considers more kinds of motion information, such as zoom, rotation, perspective motion, and other irregular motion. From the foregoing description, it can be seen that when the inter coding mode is used for inter-frame prediction, the processing object is a coding block in an image; in the affine coding mode, the processing object is no longer the entire coding block: the coding block is divided into multiple coding sub-blocks, and each coding sub-block is used as a processing object.
  • each coding sub-block in affine coding mode corresponds to a motion vector
  • the motion vectors corresponding to multiple coding sub-blocks form the motion vector field in affine coding mode.
  • Motion compensation in affine coding mode refers to using the motion vector field and the reference frame to obtain the predicted frame.
  • the motion vector of each coding sub-block included in each coding block in the affine coding mode may be calculated through the parameters of the control points on the coding block.
  • the number of control points on each coding block can be two or three.
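  • How a motion vector field can be derived from control points is sketched below for the two-control-point case. The text does not give the formula, so this example assumes the widely used 4-parameter affine model (control points at the top-left and top-right corners of the block); the 4x4 sub-block size, the sampling of each sub-block at its centre, and the name `affine_mv_field` are likewise assumptions.

```python
def affine_mv_field(v0, v1, width, height, sub=4):
    """Derive one motion vector per sub-block from two control-point vectors.

    v0: (x, y) motion vector at the top-left corner of the coding block.
    v1: (x, y) motion vector at the top-right corner of the coding block.
    Returns a dict mapping each sub-block origin (x0, y0) to its (mvx, mvy).
    """
    a = (v1[0] - v0[0]) / width   # scaling/rotation term along x
    b = (v1[1] - v0[1]) / width   # scaling/rotation term along y
    field = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            cx, cy = x0 + sub / 2, y0 + sub / 2   # sub-block centre
            mvx = a * cx - b * cy + v0[0]
            mvy = b * cx + a * cy + v0[1]
            field[(x0, y0)] = (mvx, mvy)
    return field


# Identical control-point vectors describe pure translation.
translation_field = affine_mv_field((2.0, -1.0), (2.0, -1.0), 16, 16)
```

  • When both control-point vectors are equal, every sub-block receives the same vector, which is the translational case handled by the inter coding mode.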
  • When the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set, and the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set, increasing each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set if the video type of the video to be processed is a preset video type includes: acquiring the coding mode used when encoding the video to be processed; if the coding mode is the first type of coding mode, modifying each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set; if the coding mode is the second type of coding mode, modifying each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
  • The encoding mode used when encoding the video to be processed may be selected by the encoding device according to the motion information included in the video to be processed. Specifically, if the encoding device determines that the video to be processed includes multiple kinds of motion information such as rotation, translation, and zooming, the first type of encoding mode can be selected to encode the video to be processed; if the encoding device determines that the video to be processed only includes translational motion information, the second type of encoding mode can be selected to encode the to-be-processed video.
  • the encoding mode used when encoding the to-be-processed video may also be determined by the encoding device according to a setting operation input by the user on the user interface.
  • For example, in the first type of encoding mode, the first initial pixel accuracy set is (1/2 pixel accuracy, integer pixel accuracy, 1/4 pixel accuracy); in the second type of encoding mode, the second initial pixel accuracy set is (1/4 pixel accuracy, 1/8 pixel accuracy, 1/16 pixel accuracy).
  • After the encoding device determines the encoding mode required for encoding the to-be-processed video, if the encoding mode is determined to be the first type of encoding mode, each pixel accuracy value in the first initial pixel accuracy set (integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy) is modified to obtain the first target pixel accuracy set, which can be expressed as (integer pixel accuracy, 4 pixel accuracy, 8 pixel accuracy); if the encoding mode is determined to be the second type of encoding mode, each pixel accuracy value in the second initial pixel accuracy set (1/4 pixel accuracy, 1/8 pixel accuracy, 1/16 pixel accuracy) is modified to obtain the second target pixel accuracy set, which can be expressed as (1/2 pixel accuracy, integer pixel accuracy, 2 pixel accuracy).
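  • The mode-dependent choice between initial and target sets can be sketched as a pair of lookup tables. The numeric sets are the example values given in the text; everything else (the names `select_coding_mode` and `pixel_accuracy_set`, the string labels for motion kinds, and treating a video with no listed motion as translational) is an assumption made for this illustration.

```python
from fractions import Fraction as F

FIRST_TYPE, SECOND_TYPE = "first_type", "second_type"

# Example sets from the text, in pixel units.
INITIAL_SETS = {
    FIRST_TYPE: (F(1), F(1, 2), F(1, 4)),
    SECOND_TYPE: (F(1, 4), F(1, 8), F(1, 16)),
}
TARGET_SETS = {
    FIRST_TYPE: (F(1), F(4), F(8)),
    SECOND_TYPE: (F(1, 2), F(1), F(2)),
}


def select_coding_mode(motion_kinds):
    """Pick the second type for translation-only content, else the first type."""
    if set(motion_kinds) <= {"translation"}:
        return SECOND_TYPE
    return FIRST_TYPE


def pixel_accuracy_set(mode, is_preset_type):
    """Use the target set for the preset video type, otherwise the initial set."""
    return TARGET_SETS[mode] if is_preset_type else INITIAL_SETS[mode]


mode = select_coding_mode(["rotation", "translation", "zoom"])
accuracy_set = pixel_accuracy_set(mode, is_preset_type=True)
```

  • A table is used rather than a formula because the example values are not related by a single scale factor (1/2 becomes 4, while 1/4 becomes 8).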
  • Step S403 Perform encoding processing on the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
  • Inter-frame prediction is performed on the to-be-processed video based on the target pixel accuracy set.
  • the performing inter-frame prediction on the to-be-processed video based on the target pixel accuracy set is essentially performing inter-frame prediction on each frame of the to-be-processed video.
  • After the to-be-processed video is encoded to obtain the encoded video, the encoded video can be transmitted to the decoding end in the form of a bit stream; the decoding end can decode the encoded video, and the video obtained by decoding is transmitted to the display device and displayed by the display device.
  • Through the encoding processing, some redundant information included in the video to be processed is eliminated, which greatly reduces the data volume of the video to be processed, improves video transmission efficiency, and also improves the user's viewing experience. For example, suppose one second of digital TV video is transmitted over a 1M transmission bandwidth. Without encoding processing, it needs to be transmitted for 9 seconds, that is, the user needs to wait 9 seconds to watch one second of digital TV video; after processing with the video processing method of the embodiment of the present invention, it may only need to be transmitted for 1 second.
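  • The arithmetic behind this example can be made explicit. The sketch below assumes that "1M" means 1 Mbit/s, so that 9 seconds of transmission corresponds to 9 Mbit of unencoded data; both helper names are hypothetical.

```python
def transmission_seconds(size_mbit, bandwidth_mbit_per_s):
    """Time needed to send `size_mbit` megabits over the given bandwidth."""
    return size_mbit / bandwidth_mbit_per_s


def required_compression_ratio(raw_seconds, target_seconds):
    """Factor by which the data must shrink to meet the target transmission time."""
    return raw_seconds / target_seconds


raw_time = transmission_seconds(9.0, 1.0)            # unencoded video
ratio = required_compression_ratio(raw_time, 1.0)    # to send it in 1 second
encoded_time = transmission_seconds(9.0 / ratio, 1.0)
```

  • Under these assumptions the encoder would need to reduce the data volume by a factor of 9 for the one-second clip to be delivered in one second.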
  • Identification information that identifies the video type to which a frame of image belongs is added to each frame of image, and after the to-be-processed video is encoded, an encoded video that includes the identification information is obtained. The encoded video and the identification information are sent to the decoding end, so that the decoding end decodes according to the encoding situation of each frame of image.
  • Encoding the to-be-processed video based on the target pixel accuracy set to obtain the encoded video includes: adding identification information to the encoded video; and sending the encoded video with the identification information added to the decoding end, where the identification information is used to instruct the decoding end to decode the encoded video based on the identification information.
  • the encoding device may set an index identifier for each pixel accuracy included in the target pixel accuracy set. In this way, the pixel accuracy used when encoding a certain frame of image can be known from the index identifier.
  • The index identifiers set for the pixel accuracies included in the target pixel accuracy set may be as follows: assuming that the target pixel accuracy set is (integer pixel accuracy, 4 pixel accuracy, 8 pixel accuracy), 0 represents integer pixel accuracy, 00 represents 4-pixel accuracy, and 01 represents 8-pixel accuracy.
  • the encoding device may send the index identifier and the encoded video together to the decoder, where the index identifier is used to instruct the decoding end to decode the encoded video based on the index identifier.
  • Encoding the to-be-processed video based on the target pixel accuracy set to obtain the encoded video includes: setting an index identifier for each pixel accuracy included in the target pixel accuracy set; determining the target pixel accuracy in the target pixel accuracy set used when encoding the to-be-processed video, and the index identifier corresponding to that target pixel accuracy; and adding the index identifier to the encoded video and sending the encoded video with the index identifier added to the decoding end, where the index identifier is used to instruct the decoding end to decode the encoded video based on the index identifier.
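  • The index identifier scheme can be sketched as a two-way lookup attached to each encoded frame. The identifiers 0, 00, and 01 are the example values from the text; the dictionary-based frame container and the names `tag_frame` and `accuracy_for` are assumptions for illustration, not a bitstream syntax.

```python
from fractions import Fraction as F

# Example mapping from the text for the target set (integer, 4-pixel, 8-pixel).
INDEX_TO_ACCURACY = {"0": F(1), "00": F(4), "01": F(8)}
ACCURACY_TO_INDEX = {acc: idx for idx, acc in INDEX_TO_ACCURACY.items()}


def tag_frame(payload, accuracy):
    """Encoder side: attach the index identifier of the accuracy that was used."""
    return {"index": ACCURACY_TO_INDEX[accuracy], "payload": payload}


def accuracy_for(tagged_frame):
    """Decoder side: recover the accuracy from the index identifier."""
    return INDEX_TO_ACCURACY[tagged_frame["index"]]


frame = tag_frame(b"frame-bytes", F(8))
recovered = accuracy_for(frame)
```

  • The identifier is kept as a separate field here because "0" is a prefix of "00" and "01", so these strings could not be told apart if they were simply concatenated into a bit stream.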
  • In the embodiment of the present invention, the encoding device judges the video type of the acquired to-be-processed video; if the video type of the to-be-processed video is a preset video type, each pixel accuracy value in the initial pixel accuracy set is increased to obtain a target pixel accuracy set, and the to-be-processed video is then encoded based on the target pixel accuracy set to obtain an encoded video.
  • In the above process of encoding the video to be processed, the target pixel accuracy set used in the encoding process is determined according to the video type of the video to be processed, so that a target pixel accuracy set is selected specifically for to-be-processed videos of different video types, which can improve the quality of the encoded video.
  • the encoding system may include an encoding terminal 501 and a decoding terminal 502.
  • the encoding end 501 and the decoding end 502 may be configured in the same terminal device, or the encoding end 501 and the decoding end 502 may be two independent devices.
  • the encoding end 501 is used to compress and encode the video to be processed using a suitable pixel accuracy set to reduce redundant information included in the video to be processed; the encoding end 501 sends the encoded video obtained after the compression encoding process to the decoding end 502, and the decoding end 502 uses the pixel accuracy set corresponding to the encoding end and other encoding information to decode the encoded video.
  • the encoding end 501 determines the video type of the video to be processed and, according to that video type, selects the target pixel accuracy set required for encoding the video to be processed. If the encoding end 501 determines that the video type of the to-be-processed video is a preset video type, each pixel accuracy in the initial pixel accuracy set stored in the encoding end 501 is modified to obtain the target pixel accuracy set, and the to-be-processed video is encoded based on the target pixel accuracy set; if the encoding end 501 determines that the video type of the to-be-processed video is not a preset video type, the initial pixel accuracy set stored in the encoding end 501 is used to encode the to-be-processed video.
  • the preset video type described in step S602 may include any one or more of screen content video and natural video, or the preset video type may include any one of long video and short video, or the preset video type may be another video type; the embodiment of the present invention does not limit the preset video type.
  • the method for the encoding terminal 501 to determine the video type of the video to be processed refer to the description of the related content in the embodiment of FIG. 4, which will not be repeated here.
  • the encoding mode used by the encoding end 501 to encode the video to be processed can be the first type of encoding mode or the second type of encoding mode.
  • the first type of encoding mode may include any one of the inter encoding mode and the affine encoding mode.
  • the second type of encoding mode may include the other one of the inter encoding mode and the affine encoding mode.
  • In different encoding modes, the initial pixel accuracy sets corresponding to the encoding end 501 are different. Specifically, the initial pixel accuracy set corresponding to the first type of encoding mode may be the first initial pixel accuracy set, and the initial pixel accuracy set corresponding to the second type of encoding mode may be the second initial pixel accuracy set.
  • After the encoding end 501 determines that the video type of the video to be processed is the preset video type, and before determining the target pixel accuracy set, the encoding end 501 also needs to determine the encoding mode: if the encoding mode is the first type of encoding mode, the encoding end 501 adjusts each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set; if the encoding mode is the second type of encoding mode, the encoding end 501 adjusts each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
  • the encoding end 501 performs encoding processing on the video to be processed based on the target pixel accuracy set in step S603; after obtaining the encoded video, it can also add identification information to the encoded video in step S604 and send the encoded video with the identification information added to the decoding end 502.
  • the identification information refers to information used to identify that the video type corresponding to the encoded video is a preset video type. In other words, if the decoding end 502 detects that the encoded video includes this identification information, it can determine that the video type corresponding to the encoded video is the preset video type.
  • the decoding end 502 receives the encoded video sent by the encoding end 501 in step S605 and can extract the information included in the encoded video. If it is determined in step S606 that the encoded video includes identification information, the video type corresponding to the received encoded video is determined to be the preset video type; further, the decoding end 502 can obtain the target pixel accuracy set that the encoding end 501 set for the preset video type, and decode the encoded video based on the target pixel accuracy set in step S607.
  • after the encoding end 501 performs step S603 to obtain the encoded video, in addition to adding identification information to the encoded video, it can also add an index identifier to the encoded video.
  • the index identifier is used to identify the target pixel accuracy used by the encoding end 501 when encoding the video to be processed, and the target pixel accuracy is any one of the pixel accuracies in the target pixel accuracy set. It is understandable that the video to be processed is composed of multiple frames of images, so the encoding of the video to be processed is also performed frame by frame. After each frame of image is encoded, the encoded video containing the encoding result of each frame is sent to the decoding end in the form of a code stream. Therefore, the target pixel accuracy used when encoding the video to be processed is essentially determined for each frame of the video to be processed.
  • the encoding end 501 sends the encoded video with the index identifier to the decoding end 502; the decoding end 502 determines the target pixel accuracy identified by the index identifier from the target pixel accuracy set, and decodes the encoded video based on that target pixel accuracy. After the decoding end 502 decodes the encoded video based on the target pixel accuracy, a video with a small amount of data that can be displayed or stored is obtained.
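  • The decoder-side selection in steps S605 to S607 can be sketched as follows. The dictionary standing in for the received encoded video, the field names `identification` and `index`, the example sets, and the fallback to the initial set when no identification information is present are all assumptions made for this illustration.

```python
from fractions import Fraction as F

INITIAL_SET = (F(1), F(1, 2), F(1, 4))   # used when no identification info is present
TARGET_SET = (F(1), F(4), F(8))          # set by the encoder for the preset video type
INDEX_TO_POSITION = {"0": 0, "00": 1, "01": 2}


def select_accuracy_set(encoded_video):
    """Choose the accuracy set according to the presence of identification info."""
    if encoded_video.get("identification", False):
        return TARGET_SET
    return INITIAL_SET


def select_accuracy(encoded_video):
    """Resolve the index identifier to one accuracy within the selected set."""
    accuracy_set = select_accuracy_set(encoded_video)
    return accuracy_set[INDEX_TO_POSITION[encoded_video["index"]]]


received = {"identification": True, "index": "01", "payload": b"..."}
decode_accuracy = select_accuracy(received)
```

  • Splitting the logic into two functions mirrors the two signals described in the text: the identification information selects the set, and the index identifier selects the accuracy within that set.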
  • In the embodiment of the present invention, the encoding end 501 sets the corresponding target pixel accuracy set according to the video type of the video to be processed and uses it to encode the video to be processed, which improves the quality of video encoding.
  • Based on the identification information included in the transmitted encoded video, the decoding end 502 can accurately select the target pixel accuracy set and decode the encoded video, which can ensure that the video content is not damaged and improves the decoding quality.
  • FIG. 7 is a schematic structural diagram of an encoding device provided by an embodiment of the present invention.
  • the encoding device described in FIG. 7 may include a memory 701 and a processor 702, where the memory 701 and the processor 702 are connected through a bus 703, program code is stored in the memory 701, and the processor 702 calls the program code in the memory 701.
  • the memory 701 may include volatile memory, such as random-access memory (RAM); the memory 701 may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); the memory 701 may also include a combination of the foregoing types of memories.
  • the processor 702 may be a central processing unit (CPU).
  • the processor 702 may further include a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), etc.
  • the PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), etc.
  • the processor 702 may also be a combination of the foregoing structures.
  • the memory 701 is used to store a computer program, and the computer program includes program instructions.
  • the processor 702 is used to execute the program instructions stored in the memory 701 to implement the steps of the corresponding method in the embodiment shown in FIG. 4 above.
  • the processor 702 is configured to perform the following when the program instructions are called: determine the video type of the acquired to-be-processed video; if the video type of the to-be-processed video is a preset video type, modify each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set; and encode the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
  • when determining the video type of the acquired video to be processed, the processor 702 performs the following operations: determining the hash value corresponding to the video to be processed; if the hash value is not greater than the threshold, determining that the video type of the video to be processed is a preset video type; if the hash value is greater than the threshold, determining that the video type of the video to be processed is not the preset video type.
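  • The hash-threshold decision and the resulting choice of accuracy set can be sketched together. How the hash value is computed is not described in this passage, so the sketch takes it as an input; the names `is_preset_video_type` and `choose_accuracy_set`, and the `modify` callable that stands in for the adjustment rule, are hypothetical.

```python
from fractions import Fraction as F


def is_preset_video_type(hash_value, threshold):
    """A hash value not greater than the threshold indicates the preset video type."""
    return hash_value <= threshold


def choose_accuracy_set(hash_value, threshold, initial_set, modify):
    """Return the modified set for the preset video type, else the initial set."""
    if is_preset_video_type(hash_value, threshold):
        return modify(initial_set)
    return initial_set


def double_each(accuracy_set):
    """Illustrative adjustment rule only: double every accuracy value."""
    return tuple(value * 2 for value in accuracy_set)


initial = (F(1, 4), F(1, 8))
for_preset = choose_accuracy_set(3, 10, initial, double_each)
for_other = choose_accuracy_set(11, 10, initial, double_each)
```

  • Passing the adjustment rule in as a callable keeps the type decision separate from the rule itself, which the text leaves open.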
  • when determining the video type of the acquired to-be-processed video, the processor 702 performs the following operations: calling the video type recognition model to recognize the to-be-processed video and obtain a recognition result; if the recognition result indicates that the video type is the preset video type, determining that the video type of the to-be-processed video is the preset video type.
  • the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set
  • the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set.
  • the first type of coding mode may include any one of an inter coding mode and an affine coding mode
  • the second type of coding mode may include the other one of the inter coding mode and the affine coding mode.
  • when the processor 702 encodes the to-be-processed video based on the target pixel accuracy set to obtain an encoded video, it performs the following operations: adding identification information to the encoded video; and sending the encoded video with the identification information added to the decoding end, where the identification information is used to instruct the decoding end to decode the encoded video based on the identification information.
  • when the processor 702 encodes the to-be-processed video based on the target pixel accuracy set to obtain an encoded video, it performs the following operations: setting an index identifier for each pixel accuracy included in the target pixel accuracy set; determining the target pixel accuracy in the target pixel accuracy set used when encoding the video to be processed, and the index identifier corresponding to that target pixel accuracy; and adding the index identifier to the encoded video and sending the encoded video with the index identifier added to the decoding end, where the index identifier is used to instruct the decoding end to decode the encoded video based on the index identifier.
  • FIG. 8 is a schematic structural diagram of a decoding device provided by an embodiment of the present invention.
  • the decoding device described in FIG. 8 may include a memory 801 and a processor 802.
  • the memory 801 and the processor 802 are connected through a bus 803.
  • program code is stored in the memory 801, and the processor 802 calls the program code in the memory 801.
  • the memory 801 may include volatile memory, such as random-access memory (RAM); the memory 801 may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); the memory 801 may also include a combination of the foregoing types of memories.
  • the processor 802 may be a central processing unit (CPU).
  • the processor 802 may further include a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), etc.
  • the PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), etc.
  • the processor 802 may also be a combination of the foregoing structures.
  • the memory 801 is used to store a computer program, and the computer program includes program instructions, and the processor 802 is used to execute the program instructions stored in the memory 801.
  • the processor 802 is configured to call the program instructions to execute: receiving an encoded video; when the encoded video includes identification information, determining that the video type corresponding to the encoded video is a preset video type; and decoding the encoded video based on the target pixel accuracy set, where the target pixel accuracy set is obtained by modifying each pixel accuracy value in the initial pixel accuracy set.
  • the encoded video further includes an index identifier, and when decoding the encoded video based on the target pixel accuracy set, the processor 802 performs the following operations: determining, from the target pixel accuracy set, the target pixel accuracy identified by the index identifier; and decoding the encoded video based on the target pixel accuracy.
  • the program can be stored in a computer readable storage medium. During execution, it may include the procedures of the above-mentioned method embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video processing method and device, an encoding apparatus, and a decoding apparatus. The method can comprise: determining a video type of an acquired video to be processed; if the video type of said video is a preset video type, modifying respective pixel precision values in an initial pixel precision set to obtain a target pixel precision set; and performing encoding processing on said video on the basis of the target pixel precision set to obtain an encoded video. The embodiments of the present invention can enhance encoding performance of a terminal apparatus.

Description

Video processing method, device, encoding device, and decoding device
Technical field
The present invention relates to the field of communication technology, and in particular to a video processing method, device, encoding device, and decoding device.
Background
With the continuous development of the information age, more and more users in daily life record or store content by shooting video with a camera or video camera. Because the captured video has a large amount of data, a terminal device needs to encode the video content when storing or transmitting video, and then store or transmit the encoded video. When the video needs to be displayed, the encoded video is decoded using the decoding method corresponding to the encoding process and then displayed.
In the process of encoding video, one of the key technologies is inter-frame prediction. The main idea of inter-frame prediction is to obtain the predicted frame from the motion vector of the current frame and the reference frame in the video. In this process, the choice of pixel accuracy for the motion vector directly determines the quality of inter-frame prediction, which in turn affects the quality of video coding. Therefore, in the field of digital video coding technology, how to choose the pixel accuracy used during coding has become an active research topic.
Summary of the invention
The embodiments of the present invention provide a video processing method, device, encoding device, and decoding device, which can improve the encoding performance of a terminal device.
In a first aspect, an embodiment of the present invention provides a video processing method, including:
determining the video type of the acquired video to be processed;
if the video type of the video to be processed is a preset video type, modifying each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set;
encoding the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
In a second aspect, an embodiment of the present invention provides another video processing method, including:
receiving an encoded video;
when the encoded video includes identification information, determining that the video type corresponding to the encoded video is a preset video type;
decoding the encoded video based on the target pixel accuracy set;
where the target pixel accuracy set is obtained by modifying each pixel accuracy value in the initial pixel accuracy set.
In a third aspect, an embodiment of the present invention provides a video processing device, including a determining unit and a processing unit:
the determining unit is configured to determine the video type of the acquired video to be processed;
the processing unit is configured to, if the determining unit determines that the video type of the video to be processed is a preset video type, modify each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set;
the processing unit is further configured to encode the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
In a fourth aspect, an embodiment of the present invention also provides another video processing device, including a receiving unit and a processing unit:
the receiving unit is configured to receive an encoded video;
the processing unit is configured to determine that the video type corresponding to the encoded video is a preset video type when the encoded video includes identification information;
the processing unit is further configured to decode the encoded video based on the target pixel accuracy set;
where the target pixel accuracy set is obtained by modifying each pixel accuracy value in the initial pixel accuracy set.
In a fifth aspect, an embodiment of the present invention provides an encoding device, characterized in that it includes a memory and a processor, where the memory is connected to the processor, the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the video processing method of the first aspect described above.
In a sixth aspect, an embodiment of the present invention provides a decoding device, characterized in that it includes a memory and a processor, where the memory is connected to the processor, the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the video processing method of the second aspect described above.
In a seventh aspect, an embodiment of the present invention also provides a computer storage medium. The computer storage medium stores first computer program instructions which, when executed by a processor, are used to execute the video processing method of the first aspect; the computer storage medium also stores second computer program instructions which, when executed by a processor, are used to execute the video processing method of the second aspect.
In the embodiments of the present invention, the terminal device determines the video type of the acquired to-be-processed video. If the video type is a preset video type, each pixel accuracy value in the initial pixel accuracy set is increased to obtain a target pixel accuracy set, and the to-be-processed video is then encoded based on the target pixel accuracy set to obtain an encoded video. Because the target pixel accuracy set used for encoding is determined according to the video type of the to-be-processed video, a target pixel accuracy set can be selected specifically for to-be-processed videos of different video types, which can improve the quality of the encoded video.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a scene diagram of drone aerial photography provided by an embodiment of the present invention;
FIG. 2a is a schematic diagram of motion estimation provided by an embodiment of the present invention;
FIG. 2b is a schematic diagram of determining a motion vector provided by an embodiment of the present invention;
FIG. 3a is a schematic diagram of another motion estimation provided by an embodiment of the present invention;
FIG. 3b is a schematic diagram of yet another motion estimation provided by an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a video processing method provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of an encoding system provided by an embodiment of the present invention;
FIG. 6 is an interaction diagram provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an encoding device provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a decoding device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To address the problem of selecting pixel accuracy in video encoding, the embodiments of the present invention propose a video processing method that sets the pixel accuracy set used for encoding according to the video type of the to-be-processed video, which can improve video encoding performance. Specifically, the video processing method provided by the embodiments may include: determining the video type of the acquired to-be-processed video; if the video type of the to-be-processed video is a preset video type, modifying each pixel accuracy value in the initial pixel accuracy set to obtain a target pixel accuracy set; and encoding the to-be-processed video based on the target pixel accuracy set to obtain an encoded video. During this encoding process, the initial pixel accuracy set is modified according to the video type of the to-be-processed video to obtain a pixel accuracy set suited to that video. A target pixel accuracy set is thus selected specifically for to-be-processed videos of different video types, which can improve the quality of the encoded video.
The video processing method provided by the embodiments of the present invention can be applied in various video encoding and transmission scenarios. The method is described below using drone aerial photography as an example. FIG. 1 is a scene diagram of drone aerial photography provided by an embodiment of the present invention, and is assumed to include a drone 101, a camera area 102, and a display device 103. A camera device 1011 is mounted on the drone 101 and can be used to shoot video and capture images. The drone 101 may also be equipped with a gimbal 1012, through which the camera device 1011 is mounted on the drone 101. The camera area 102 includes vehicles, trees, and a river; the camera device 1011 shoots the camera area 102 to obtain the to-be-processed video.
An initial pixel accuracy set may be configured by default in the drone 101. The pixel accuracies in this initial set are not necessarily suitable for videos of every video type. Therefore, after acquiring the to-be-processed video, the drone 101 does not encode it directly with the initial pixel accuracy set. Instead, it determines the video type of the to-be-processed video and then determines whether that video type is suitable for encoding with the initial pixel accuracy set. If the video type can be encoded with the initial pixel accuracy set, the to-be-processed video is encoded based on the initial pixel accuracy set. If the video type is not suitable for the initial pixel accuracy set, each pixel accuracy value in the initial set is modified so that the modified values suit the video type of the to-be-processed video; the modified pixel accuracy values form the target pixel accuracy set. The drone 101 then encodes the to-be-processed video based on the target pixel accuracy set.
Optionally, the drone 101 sends the encoded video obtained by encoding the to-be-processed video to a decoding end. The decoding end may be configured in the drone 101 or may be a decoding device independent of the drone. The decoding end decodes the encoded video using a corresponding decoding strategy and sends the decoded video to the display device 103. The display device 103 may be a device with a display screen; on receiving the decoded video from the decoding end, it can show the decoded video on the display screen so that the user can watch it.
In one embodiment, video refers to the various technologies for capturing, recording, processing, storing, transmitting, and reproducing a series of still images as electrical signals. Raw video captured by a shooting device such as a still camera or video camera contains a large amount of redundant information, so uncompressed video data is very large, difficult to store, and inconvenient to transmit over a network. For example, one second of digital TV video amounts to about 1113 KB, that is, 9,123,840 bits. With a transmission bandwidth of 1 Mbit/s, transmitting that one second of video takes about 9 seconds, so a user who wants to watch one second of digital TV video has to wait about 9 seconds, which greatly degrades the user experience. As another example, an uncompressed 10-second video is about 2.4 GB. On a mobile phone with 16 GB of storage, of which at most 12 GB remains after the part occupied by the system, at most 50 seconds of such video can be stored.
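The figures quoted above can be checked with a short calculation. The following is only an illustrative sketch: it assumes that "1M" means 10**6 bits per second and that the storage sizes are expressed in the same unit (GB), neither of which is stated explicitly in the text.

```python
# Worked check of the transmission and storage figures quoted in the text.
VIDEO_BITS_PER_SECOND = 9_123_840   # bits in one second of digital TV video (from the text)
BANDWIDTH_BPS = 1_000_000           # assumption: "1M" read as 10**6 bit/s


def transmit_seconds(payload_bits, bandwidth_bps):
    """Time needed to send payload_bits over a link of bandwidth_bps."""
    return payload_bits / bandwidth_bps


def storable_seconds(free_space, clip_size, clip_seconds):
    """Seconds of video that fit into free_space, given one clip's size and duration."""
    return free_space / clip_size * clip_seconds


size_kb = VIDEO_BITS_PER_SECOND / 8 / 1024                      # about 1113.75 KB
delay = transmit_seconds(VIDEO_BITS_PER_SECOND, BANDWIDTH_BPS)  # about 9.12 s
capacity = storable_seconds(12, 2.4, 10)                        # about 50 s
```

Under these assumptions the calculation reproduces the roughly 1113 KB, 9 seconds, and 50 seconds mentioned above.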
Therefore, to solve the difficulty of storing and transmitting video, the original video needs to be compressed. Compression removes the large amount of redundant information contained in the original video, such as temporal redundancy, visual redundancy, and spatial redundancy; compressing the original video is essentially the process of encoding it. In the embodiments of the present invention, the to-be-processed video is the original video. One of the key technologies in video encoding is inter-frame prediction, which exploits the temporal correlation between adjacent video frames: a previously encoded, reconstructed frame is used as a reference frame, and the current frame is predicted through motion estimation and motion compensation, thereby removing temporal redundancy from the video. Put simply, inter-frame prediction rests on the fact that the scenes in neighboring frames of a moving picture are correlated, so encoding does not need to transmit all the information of every frame, only the differences between frames.
A video can be regarded as consisting of multiple frames of images, and encoding a video means encoding each frame it contains. In one embodiment, to encode any frame of the video, the frame is first divided into multiple coding regions, each coding region is further divided into multiple coding units, each coding unit includes multiple coding blocks, and inter-frame prediction is performed on each coding block in turn. Taking a target coding block in a coding unit of the current frame as an example, inter-frame prediction proceeds as follows: a reference frame corresponding to the current frame is found in the time domain, the reference frame being any one of the already-encoded frames near the current frame in the time domain; the reference frame is searched for a similar block that resembles the target coding block, and the relative position between the target coding block and the similar block is determined (as shown in FIG. 2a and FIG. 2b); this relative position is called the motion vector (MV), and for ease of description the process of determining the motion vector is referred to below as motion estimation; a prediction block corresponding to the target coding block is then obtained from the motion vector, the related information of the motion vector, and the reference frame. Applying the same process to every coding block of the current frame yields the prediction block of each coding block, and hence the prediction frame of the current frame.
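The search for a similar block described above can be illustrated with a minimal integer-pixel full search. This is a sketch under stated assumptions, not the search strategy of the embodiments: the sum of absolute differences (SAD) as the similarity measure, the function names, and the tiny 8x8 test frames are all chosen here for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bx, by, size, search_range):
    """Full search over integer displacements; returns the best (dx, dy) and its SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Reference frame with unique sample values; the current frame holds a 2x2 block
# at (3, 3) copied from position (1, 2) of the reference frame.
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        cur[3 + y][3 + x] = ref[2 + y][1 + x]

mv, cost = estimate_motion(cur, ref, bx=3, by=3, size=2, search_range=2)
```

Here the motion vector found is the displacement from the target block's position to the matching block in the reference frame.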
In one embodiment, the related information of the motion vector includes the pixel accuracy used in the motion estimation process (which can also be understood as the pixel accuracy of the motion vector), the motion vector difference (MVD), and so on. The MVD is the difference between the motion vector obtained through motion estimation and the motion vector prediction (MVP), where the MVP is calculated from multiple MVs associated with several neighboring already-encoded blocks of the current coding block. A larger pixel accuracy value indicates a coarser pixel accuracy and less accurate motion estimation; a smaller pixel accuracy value indicates a finer pixel accuracy and more accurate motion estimation. For example, FIG. 3a and FIG. 3b are schematic diagrams of two kinds of motion estimation provided by embodiments of the present invention. In both figures, black dots represent integer pixels and white dots represent 1/2 pixels; 301 denotes a target coding block in the current frame, and 302 denotes the similar block in the reference frame. Assume that FIG. 3a uses integer-pixel accuracy, and that arrow 303 represents the motion vector of the target coding block, that is, the difference between the position of the target coding block in the previous frame and its position in the current frame. Assume that FIG. 3b uses 1/2-pixel accuracy, and that 304 represents the motion vector of the target coding block. Comparing FIG. 3a with FIG. 3b shows that the motion vector represented by 304 is more accurate than the motion vector represented by 303.
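The relationship between the pixel accuracy value and the accuracy of the motion vector can be made concrete by rounding one displacement component to different step sizes. The sketch below is illustrative only; the displacement of 1.3 pixels and the function name are assumptions made for this example.

```python
from fractions import Fraction


def quantize(component, step):
    """Round a displacement (in pixels) to the nearest multiple of the pixel accuracy value."""
    step = Fraction(step)
    return round(Fraction(component) / step) * step


true_displacement = Fraction(13, 10)  # hypothetical true motion of 1.3 pixels

# Rounding error for 4-pixel, integer-pixel, 1/2-pixel and 1/4-pixel accuracy.
errors = {
    step: abs(true_displacement - quantize(true_displacement, step))
    for step in (Fraction(4), Fraction(1), Fraction(1, 2), Fraction(1, 4))
}
```

In this example the rounding error shrinks as the pixel accuracy value decreases, which matches the statement that a smaller value means more accurate motion estimation.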
The human visual system is insensitive to certain details, so some motion details in a video can be encoded with low pixel accuracy while others need to be encoded with high pixel accuracy. For this reason, to improve the quality of inter-frame prediction, the encoding device may use adaptive motion vector resolution (AMVR) to determine the pixel accuracy of the motion vectors used in inter-frame prediction. From the above description, AMVR essentially determines the pixel accuracy of the MVD. In one embodiment, the main principle of AMVR is as follows: the encoding device sets a pixel accuracy set that includes at least two pixel accuracies, and when encoding a coding unit of a video, it adaptively selects a pixel accuracy from the set according to the characteristics of that coding unit and uses it as the pixel accuracy of the MVD.
It should be understood that the encoding device sets a pixel accuracy set and, when encoding a video, selects a suitable pixel accuracy from the set for each coding unit. This removes visual redundancy from the video while also reducing the amount of data the encoding device has to process, saving some terminal power. For example, the pixel accuracy set may be (integer-pixel accuracy, 1/2-pixel accuracy, 1/4-pixel accuracy), or it may be (integer-pixel accuracy, 4-pixel accuracy, 1/4-pixel accuracy).
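Selecting a pixel accuracy per coding unit from such a set can be sketched with a toy cost model. The cost used here (rounding error plus a penalty proportional to the number of steps to be signalled) and the weight `lam` are assumptions made purely for illustration; the text does not specify how the encoder makes the choice.

```python
from fractions import Fraction

# Example set from the text: integer-pixel, 1/2-pixel and 1/4-pixel accuracy.
PIXEL_ACCURACY_SET = (Fraction(1), Fraction(1, 2), Fraction(1, 4))


def select_accuracy(mvd, accuracy_set, lam):
    """Pick the accuracy value with the lowest toy cost for one MVD component."""
    def cost(step):
        steps = round(mvd / step)          # MVD expressed in units of the accuracy value
        error = abs(mvd - steps * step)    # what is lost by rounding
        return error + lam * abs(steps)    # hypothetical proxy for the signalling cost
    return min(accuracy_set, key=cost)


lam = Fraction(1, 100)
coarse_choice = select_accuracy(Fraction(4), PIXEL_ACCURACY_SET, lam)
fine_choice = select_accuracy(Fraction(5, 4), PIXEL_ACCURACY_SET, lam)
```

With this toy model, a whole-pixel MVD selects integer-pixel accuracy, while an MVD of 1.25 pixels selects 1/4-pixel accuracy.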
In one embodiment, video content has different characteristics under different video types, so the pixel accuracy set used when encoding video with AMVR also differs. Typically, video types include natural video and screen content video. Natural video is video obtained by shooting a scene with a camera device, without further processing. Screen content video generally refers to the content displayed on the screen of an encoding device, mainly video of computer screens, TV screens, mobile phone screens, and the like. Such video includes not only natural images but also computer-generated visual content such as text, graphics, animation, and games; it is a mixture of natural and artificial images. Compared with natural video, screen content video tends to have sharp edges, highly saturated colors, and strong contrast, and its motion tends to be more regular and simpler.
Because natural video and screen content video differ, some of the pixel accuracies in the pixel accuracy set may be unsuitable for encoding screen content video or natural video. This application therefore refers to the pixel accuracy set configured in the encoding device (for example, the set of integer-pixel, 1/2-pixel, and 1/4-pixel accuracy mentioned above) as the initial pixel accuracy set. When screen content video is encoded, each pixel accuracy value in the initial pixel accuracy set is modified to obtain a target pixel accuracy set suited to screen content video. Similarly, the initial pixel accuracy set configured in the encoding device may be suited to screen content video; if the to-be-processed video is natural video, each pixel accuracy in the initial set likewise needs to be modified to obtain a target pixel accuracy set suited to natural video.
FIG. 4 shows a video processing method provided by an embodiment of the present invention. The method can be used in any encoding device capable of encoding, and may specifically be executed by the processor of the encoding device. The video processing method may include the following steps:
Step S401: The encoding device determines the video type of the acquired to-be-processed video.
In one embodiment, the to-be-processed video acquired by the encoding device may be obtained by shooting a subject with a camera device configured on the encoding device, or it may be shot by an independent camera device and sent to the encoding device.
In one embodiment, videos obtained by shooting a subject with different camera devices may have different video formats, so a video may be in any of several formats, for example avi, mp4, mts, and mp3. Optionally, the video format of the to-be-processed video acquired by the encoding device may be any one of these formats.
As one feasible implementation, videos can be classified as natural video or screen content video according to how the video content is produced. Natural video is video shot directly by a still camera or video camera, that is, natural video consists of multiple frames of natural images, for example short videos shot on a mobile phone in daily life. Screen content video generally refers to the content displayed on the screen of an encoding device, where the encoding device may mainly be a computer, a television, a mobile phone, or the like. Specifically, screen content video includes not only natural images but also computer-generated visual content such as text, graphics, animation, or games; it is a mixture of natural video and artificial images. Examples include a movie, or playback animations added to a presentation on a computer. In the embodiments of the present invention, the video type of the to-be-processed video acquired by the encoding device may be either natural video or screen content video.
It should be understood that this classification is only one feasible video classification method listed in the embodiments of the present invention. Videos can also be classified by other criteria; for example, they can be divided into long videos and short videos according to the duration of the video content.
As the embodiments described with reference to FIG. 2a, FIG. 2b, FIG. 3a, and FIG. 3b show, when an encoding device encodes videos of different video types, setting a corresponding pixel accuracy set for each video type can improve the quality of the encoded video while saving some terminal power. Therefore, after determining the video type of the to-be-processed video, the embodiments of the present invention select a suitable pixel accuracy set for the to-be-processed video and then encode it based on the selected set. In one embodiment, the encoding device may select a suitable pixel accuracy set for the to-be-processed video through step S402.
It should be understood that the to-be-processed video consists of multiple frames of images, and encoding it means processing each of those frames. Accordingly, determining the video type of the acquired to-be-processed video, as described below, essentially means determining the video type of the current frame of the to-be-processed video that is being processed.
In one embodiment, step S401 may be implemented as follows: determine the hash value corresponding to the to-be-processed video; if the hash value is not greater than a threshold, determine that the video type of the to-be-processed video is the preset video type; if the hash value is greater than the threshold, determine that the video type of the to-be-processed video is not the preset video type. In other words, the encoding device can determine the video type of the to-be-processed video from its hash value. The threshold is a preset value used to determine the video type and may be set by the encoding device. In the embodiments of the present invention, the preset video type may be set by the encoding device and may include any one or more of screen content video and natural video. In other embodiments, the preset video type may also include either long video or short video.
Suppose that, after acquiring the to-be-processed video and before starting to encode a target frame of it, the encoding device first calculates the hash value of the target frame. If the hash value of the target frame is less than or equal to the threshold, the target frame is determined to be of the preset video type; if the hash value is greater than the threshold, the target frame is determined not to be of the preset video type. The pixel accuracy set required for encoding is then selected according to the result of this determination.
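The threshold comparison described above can be sketched as follows. How the hash value of a frame is computed is not specified here, so the sketch takes it as an input; the function names and the example sets are hypothetical.

```python
def is_preset_type(frame_hash, threshold):
    """A frame is of the preset video type when its hash value does not exceed the threshold."""
    return frame_hash <= threshold


def choose_accuracy_set(frame_hash, threshold, initial_set, target_set):
    """Use the target set for the preset video type, otherwise keep the initial set."""
    return target_set if is_preset_type(frame_hash, threshold) else initial_set


# Hypothetical values, used only to exercise the comparison.
INITIAL_SET = ("1/4", "1/2", "1")
TARGET_SET = ("1", "2", "4")

at_threshold = choose_accuracy_set(10, 10, INITIAL_SET, TARGET_SET)
above_threshold = choose_accuracy_set(11, 10, INITIAL_SET, TARGET_SET)
```

Note that a hash value exactly equal to the threshold counts as the preset video type, as stated in the text.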
In another embodiment, step S401 may also be implemented as follows: invoke a video type recognition model to recognize the to-be-processed video and obtain a recognition result; if the video type indicated by the recognition result is the preset video type, determine that the video type of the to-be-processed video is the preset video type. In other words, the encoding device may store a video type recognition model that has been trained on video samples of different video types; the encoding device invokes this model to recognize the to-be-processed video and obtains a recognition result. The recognition result may include the probability that the to-be-processed video belongs to each video type, and the video type with the higher probability is determined to be the video type of the to-be-processed video. For example, suppose the encoding device invokes the video type recognition model on the to-be-processed video and the recognition result is 30% for natural video and 70% for screen content video; based on this result, the to-be-processed video is determined to be screen content video.
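Turning the recognition result into a video type amounts to picking the most probable label. The sketch below assumes the model output is available as a mapping from type name to probability; the label strings and function names are hypothetical, and the model itself is outside the scope of the example.

```python
def most_likely_type(probabilities):
    """Return the video type with the highest probability in the recognition result."""
    return max(probabilities, key=probabilities.get)


def matches_preset(probabilities, preset_type):
    """True when the recognized video type is the preset video type."""
    return most_likely_type(probabilities) == preset_type


# Recognition result from the example in the text.
result = {"natural": 0.30, "screen_content": 0.70}
video_type = most_likely_type(result)
```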
Step S402: If the video type of the to-be-processed video is the preset video type, each pixel accuracy value in the initial pixel accuracy set is increased to obtain the target pixel accuracy set.
In one embodiment, the initial pixel accuracy set may be the default pixel accuracy set of the encoding device, or the pixel accuracy set that the encoding device used when encoding the previous frame. If the initial pixel accuracy set is the default set of the encoding device, the encoding device may determine it from the pixel accuracy sets used in past video encoding. For example, the encoding device obtains the pixel accuracy sets used in its 5 most recent video encoding operations; if 4 of them used the same set (1/2-pixel accuracy, 1/4-pixel accuracy, integer-pixel accuracy), then (1/2-pixel accuracy, 1/4-pixel accuracy, integer-pixel accuracy) is taken as the default initial pixel accuracy set. In other embodiments, the initial pixel accuracy set may be set by the encoding device according to a received setting operation: before the encoding device starts encoding, the user can set the initial pixel accuracy set through the user interface of the encoding device, and can also perform other encoding-related configuration through that interface.
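Deriving a default initial set from the encoding history can be sketched with a simple frequency count. Choosing the most frequent set within the last five operations is an assumption made for this example; the text only gives the case in which 4 of 5 operations used the same set.

```python
from collections import Counter
from fractions import Fraction


def default_initial_set(history, window=5):
    """Return the pixel accuracy set used most often in the last `window` encodings."""
    recent = history[-window:]
    most_common_set, _count = Counter(recent).most_common(1)[0]
    return most_common_set


SET_A = (Fraction(1, 2), Fraction(1, 4), Fraction(1))  # 1/2, 1/4 and integer pixel
SET_B = (Fraction(1), Fraction(4), Fraction(1, 4))     # integer, 4 and 1/4 pixel

# Four of the five most recent encodings used SET_A.
history = [SET_B, SET_A, SET_A, SET_B, SET_A, SET_A]
initial_set = default_initial_set(history)
```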
In one embodiment, once step S401 has determined that the video type of the to-be-processed video is the preset video type, the initial pixel accuracy set currently configured in the encoding device is obtained. If every pixel accuracy in the initial set meets the pixel accuracy requirements for encoding video of the preset video type, the initial pixel accuracy set can be used directly to encode the to-be-processed video. If one or more pixel accuracies in the initial set do not meet those requirements, the corresponding pixel accuracy values are modified, and the modified initial pixel accuracy set is used as the target pixel accuracy set for encoding the to-be-processed video. The modification of a pixel accuracy value may be an increase or a decrease.
In one embodiment, modifying the pixel accuracy values in the initial pixel accuracy set to obtain the target pixel accuracy set may be implemented as follows: a pixel accuracy value adjustment rule is determined according to the preset video type, and each pixel accuracy value included in the initial pixel accuracy set is modified according to that rule to obtain the target pixel accuracy set. In the embodiment of the present invention, the pixel accuracy value adjustment rule may be determined according to the motion characteristics of the preset video type and the video content of the to-be-processed video. For example, the adjustment rule may be that the difference between a modified pixel accuracy value and the corresponding pixel accuracy value before modification is less than or equal to 7 pixels. As another example, the adjustment rule may be that the difference between a modified pixel accuracy value and the corresponding pixel accuracy value before modification is less than or equal to 1/2 pixel. It should be understood that the above is only one method of modifying pixel accuracy values listed in the embodiment of the present invention, and the specific modification method is not limited in the embodiment of the present invention.
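A bounded-difference adjustment rule of the kind described above can be checked with a few lines of code. The sketch below is a hedged illustration: the function name `satisfies_rule`, the pairing of values by position, and the representation of accuracies as fractions of a pixel are all assumptions that the text does not specify.

```python
from fractions import Fraction
from typing import Sequence


def satisfies_rule(initial: Sequence[Fraction],
                   modified: Sequence[Fraction],
                   max_delta: Fraction = Fraction(1, 2)) -> bool:
    """Check that each modified value stays within `max_delta` pixels
    of the value it replaces (values are paired by position, which is
    an assumption of this sketch)."""
    if len(initial) != len(modified):
        return False
    return all(abs(new - old) <= max_delta
               for old, new in zip(initial, modified))


if __name__ == "__main__":
    initial = (Fraction(1), Fraction(1, 2), Fraction(1, 4))
    candidate = (Fraction(1), Fraction(1), Fraction(1, 2))
    print(satisfies_rule(initial, candidate))  # limit of 1/2 pixel
```

The same helper covers the 7-pixel variant of the rule by passing `max_delta=Fraction(7)`.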
In one embodiment, even for the same piece of video, the pixel accuracy set required for encoding differs depending on the encoding mode used. In the embodiment of the present invention, it is assumed that the encoding modes include a first type of encoding mode and a second type of encoding mode. The first type of encoding mode may be either the inter-frame (inter) encoding mode or the affine encoding mode, and the second type of encoding mode is the other of the inter encoding mode and the affine encoding mode. Corresponding to the two encoding modes, the initial pixel accuracy set may include a first initial pixel accuracy set and a second initial pixel accuracy set. The initial pixel accuracy set corresponding to the first type of encoding mode may be the first initial pixel accuracy set, and the initial pixel accuracy set corresponding to the second type of encoding mode may be the second initial pixel accuracy set. In other embodiments, the initial pixel accuracy set corresponding to the first type of encoding mode may instead be the second initial pixel accuracy set, and the initial pixel accuracy set corresponding to the second type of encoding mode may instead be the first initial pixel accuracy set.
In one embodiment, the main difference between the inter encoding mode and the affine encoding mode is that the inter encoding mode considers only translational motion in the video, whereas the affine encoding mode considers more kinds of motion, including irregular motion such as zooming, rotation, and perspective motion. As described above, when the inter encoding mode is used for inter-frame prediction, the processing object of inter-frame prediction is a coding block in a frame of image. In the affine encoding mode the processing object is no longer the entire coding block; instead, the coding block is divided into multiple coding sub-blocks, and each coding sub-block is treated as a processing object. Each coding sub-block in the affine encoding mode therefore corresponds to one motion vector, and the motion vectors of the coding sub-blocks together form the motion vector field of the affine encoding mode. Motion compensation in the affine encoding mode refers to obtaining the predicted frame from the motion vector field and the reference frame. In one embodiment, the motion vector of each coding sub-block included in a coding block in the affine encoding mode can be calculated from the parameters of the control points on that coding block. Generally, in the affine encoding mode the number of control points on each coding block may be two or three. The motion vectors of the coding sub-blocks of each coding block in the affine encoding mode can be calculated from the control point parameters using related methods in the prior art, which are not repeated here.
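For orientation, the two-control-point case can be illustrated with the widely used four-parameter affine motion model. The text defers to the prior art and does not say which formulation it uses, so the model choice, the function name `affine_subblock_mvs`, the 4x4 sub-block size, and the evaluation of the model at each sub-block centre are all assumptions of this sketch.

```python
from typing import List, Tuple

Vec = Tuple[float, float]


def affine_subblock_mvs(v0: Vec, v1: Vec, width: int, height: int,
                        sub: int = 4) -> List[List[Vec]]:
    """Derive one motion vector per sub-block from two control points.

    v0 is the motion vector at the top-left corner of the coding block
    and v1 the motion vector at the top-right corner (four-parameter
    affine model, assumed here for illustration).
    """
    a = (v1[0] - v0[0]) / width  # horizontal gradient of the x component
    b = (v1[1] - v0[1]) / width  # horizontal gradient of the y component
    field: List[List[Vec]] = []
    for y0 in range(0, height, sub):
        row: List[Vec] = []
        for x0 in range(0, width, sub):
            # Evaluate the model at the centre of the sub-block.
            x = x0 + sub / 2
            y = y0 + sub / 2
            mv_x = a * x - b * y + v0[0]
            mv_y = b * x + a * y + v0[1]
            row.append((mv_x, mv_y))
        field.append(row)
    return field


if __name__ == "__main__":
    # Identical control-point vectors describe a pure translation.
    for row in affine_subblock_mvs((2.0, -1.0), (2.0, -1.0), 16, 8):
        print(row)
```

When both control points carry the same motion vector, every sub-block receives that vector, which matches the translational case handled by the inter encoding mode.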
In one embodiment, as can be seen from the above description, after the video type of the to-be-processed video is determined to be the preset video type and before the corresponding target pixel accuracy set is determined for the to-be-processed video, the encoding mode to be used for encoding the to-be-processed video is determined first. The initial pixel accuracy set that needs to be adjusted is then selected according to the encoding mode, and finally each pixel accuracy in the selected initial pixel accuracy set is increased to obtain the target pixel accuracy set.
Specifically, the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set, and the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set. Increasing each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set includes: acquiring the encoding mode used when encoding the to-be-processed video; if the encoding mode is the first type of encoding mode, modifying each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set; and if the encoding mode is the second type of encoding mode, modifying each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
In one embodiment, the encoding mode used when encoding the to-be-processed video may be selected by the encoding device according to the motion information included in the to-be-processed video. Specifically, if the encoding device determines that the to-be-processed video includes multiple kinds of motion information such as rotation, translation, and zooming, the first type of encoding mode may be selected to encode the to-be-processed video; if the encoding device determines that the to-be-processed video includes only translational motion information, the second type of encoding mode may be selected to encode the to-be-processed video. In other embodiments, the encoding mode used when encoding the to-be-processed video may also be determined by the encoding device according to a setting operation input by the user on the user interface.
For example, assume that the encoding device has set the first initial pixel accuracy set for the first type of encoding mode to (integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy), and the second initial pixel accuracy set for the second type of encoding mode to (1/4 pixel accuracy, 1/8 pixel accuracy, 1/16 pixel accuracy). Assume that the encoding device obtains a to-be-processed video and determines that its video type is the preset video type. The encoding device then determines the encoding mode required for encoding the to-be-processed video. If the encoding mode is the first type of encoding mode, each pixel accuracy value in the first initial pixel accuracy set (integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy) is modified to obtain the first target pixel accuracy set, which can be expressed as (integer pixel accuracy, 4-pixel accuracy, 8-pixel accuracy). If the encoding mode is the second type of encoding mode, each pixel accuracy value in the second initial pixel accuracy set (1/4 pixel accuracy, 1/8 pixel accuracy, 1/16 pixel accuracy) is modified to obtain the second target pixel accuracy set, which can be expressed as (1/2 pixel accuracy, integer pixel accuracy, 2-pixel accuracy).
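The selection in this example can be written as a small lookup. The sets below are taken from the example; the names `ModeType`, `INITIAL_SETS`, `TARGET_SETS`, and `select_precision_set` are hypothetical, and the use of fixed tables (rather than a computed adjustment) is an assumption of this sketch.

```python
from enum import Enum
from fractions import Fraction
from typing import Tuple


class ModeType(Enum):
    FIRST = 1   # first type of encoding mode
    SECOND = 2  # second type of encoding mode


# Pixel accuracies expressed in pixels (1 = integer pixel accuracy).
INITIAL_SETS = {
    ModeType.FIRST: (Fraction(1), Fraction(1, 2), Fraction(1, 4)),
    ModeType.SECOND: (Fraction(1, 4), Fraction(1, 8), Fraction(1, 16)),
}
TARGET_SETS = {
    ModeType.FIRST: (Fraction(1), Fraction(4), Fraction(8)),
    ModeType.SECOND: (Fraction(1, 2), Fraction(1), Fraction(2)),
}


def select_precision_set(mode: ModeType,
                         is_preset_type: bool) -> Tuple[Fraction, ...]:
    """Pick the set used for encoding: the target set for video of the
    preset type, otherwise the unmodified initial set for that mode."""
    table = TARGET_SETS if is_preset_type else INITIAL_SETS
    return table[mode]


if __name__ == "__main__":
    print(select_precision_set(ModeType.FIRST, is_preset_type=True))
    print(select_precision_set(ModeType.SECOND, is_preset_type=False))
```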
Step S403: Perform encoding processing on the to-be-processed video based on the target pixel accuracy set to obtain an encoded video.
In one embodiment, after the target pixel accuracy set corresponding to the to-be-processed video is determined in step S402, inter-frame prediction is performed on the to-be-processed video based on the target pixel accuracy set, and the inter-predicted video then undergoes the remaining encoding processing, such as transformation, quantization, and entropy encoding, to obtain the encoded video. Performing inter-frame prediction on the to-be-processed video based on the target pixel accuracy set essentially means performing inter-frame prediction on each frame of image of the to-be-processed video.
In one embodiment, after the to-be-processed video is encoded to obtain the encoded video, the encoded video can be transmitted to the decoding end in the form of a bitstream. The decoding end decodes the encoded video and can transmit the decoded video to a display device for display. Once the to-be-processed video has gone through encoding, decoding, and display processing, some of the redundant information it contained has been eliminated, which greatly reduces the data volume of the to-be-processed video, improves video transmission efficiency, and also improves the viewing experience of the user. For example, suppose a one-second digital TV video is transmitted over a 1M transmission bandwidth. Without encoding processing, the transmission takes 9 seconds, that is, the user has to wait 9 seconds to watch one second of digital TV video; after processing with the video processing method of the embodiment of the present invention, the transmission may take only 1 second.
In one embodiment, after each frame of image in the to-be-processed video is encoded based on the target pixel accuracy set, identification information that identifies the video type to which the frame of image belongs is added to that frame. After every frame of image in the to-be-processed video has been processed in this way, the encoded video is obtained. The encoded video includes the identification information, and the encoded video and the identification information are sent to the decoding end so that the decoding end can decode according to how each frame of image was encoded. Specifically, encoding the to-be-processed video based on the target pixel accuracy set to obtain the encoded video includes: adding identification information to the encoded video; and sending the encoded video with the identification information added to the decoding end, where the identification information is used to instruct the decoding end to decode the encoded video based on the identification information.
In one embodiment, after determining the target pixel accuracy set corresponding to the to-be-processed video, the encoding device may set an index identifier for each pixel accuracy included in the target pixel accuracy set. In this way, the pixel accuracy used when encoding a given frame of image can be identified from its index identifier. The index identifiers may be set as follows: assuming that the target pixel accuracy set is (integer pixel accuracy, 4-pixel accuracy, 8-pixel accuracy), 0 indicates integer pixel accuracy, 00 indicates 4-pixel accuracy, and 01 indicates 8-pixel accuracy.
After encoding the to-be-processed video to obtain the encoded video, the encoding device may send the index identifier together with the encoded video to the decoding end, where the index identifier is used to instruct the decoding end to decode the encoded video based on the index identifier. Specifically, encoding the to-be-processed video based on the target pixel accuracy set to obtain the encoded video includes: setting an index identifier for each pixel accuracy included in the target pixel accuracy set; determining the target pixel accuracy, within the target pixel accuracy set, that is used when encoding the to-be-processed video, and the index identifier corresponding to that target pixel accuracy; and adding the index identifier to the encoded video and sending the encoded video with the index identifier added to the decoding end, where the index identifier is used to instruct the decoding end to decode the encoded video based on the index identifier.
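The index signalling described above can be sketched as a pair of lookups, one on each side. The index values come from the example in the text; the `EncodedFrame` container and the function names are hypothetical. The identifier is kept here as a separate per-frame field, so the sketch does not depend on the identifiers being uniquely decodable from a concatenated bit string.

```python
from dataclasses import dataclass
from fractions import Fraction

# Index identifiers from the example: 0 -> integer pixel accuracy,
# 00 -> 4-pixel accuracy, 01 -> 8-pixel accuracy.
INDEX_OF = {Fraction(1): "0", Fraction(4): "00", Fraction(8): "01"}
PRECISION_OF = {index: precision for precision, index in INDEX_OF.items()}


@dataclass
class EncodedFrame:
    payload: bytes          # encoded frame data (opaque in this sketch)
    precision_index: str    # index identifier attached by the encoder


def tag_frame(payload: bytes, precision: Fraction) -> EncodedFrame:
    """Encoder side: attach the index identifier of the accuracy used."""
    if precision not in INDEX_OF:
        raise ValueError(f"{precision} is not in the target pixel accuracy set")
    return EncodedFrame(payload, INDEX_OF[precision])


def precision_for(frame: EncodedFrame) -> Fraction:
    """Decoder side: recover the pixel accuracy from the index identifier."""
    return PRECISION_OF[frame.precision_index]


if __name__ == "__main__":
    frame = tag_frame(b"\x00\x01", Fraction(4))
    print(frame.precision_index, precision_for(frame))
```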
In the embodiment of the present invention, the encoding device determines the video type of the acquired to-be-processed video. If the video type of the to-be-processed video is the preset video type, each pixel accuracy value in the initial pixel accuracy set is increased to obtain the target pixel accuracy set, and the to-be-processed video is then encoded based on the target pixel accuracy set to obtain the encoded video. In this encoding process, the target pixel accuracy set used for encoding is determined according to the video type of the to-be-processed video, so that a target pixel accuracy set is selected specifically for each video type, which can improve the quality of the encoded video.
FIG. 5 shows an encoding system provided by an embodiment of the present invention. The encoding system may include an encoding end 501 and a decoding end 502. The encoding end 501 and the decoding end 502 may be configured in the same terminal device, or they may be two mutually independent devices. In the encoding system shown in FIG. 5, the encoding end 501 is used to compress and encode the to-be-processed video using a suitable pixel accuracy set so as to reduce the redundant information included in the to-be-processed video. The encoding end 501 sends the encoded video obtained by encoding the to-be-processed video to the decoding end 502, and the decoding end 502 decodes the encoded video using the pixel accuracy set corresponding to the encoding end together with other encoding information.
FIG. 6 is an interaction diagram provided by an embodiment of the present invention. The interaction flow between the encoding end 501 and the decoding end 502 of FIG. 5 when encoding and decoding a video is described below with reference to FIG. 6. In one embodiment, after obtaining the to-be-processed video in step S601, the encoding end 501 determines the video type of the to-be-processed video and, according to that video type, selects the target pixel accuracy set required for encoding the to-be-processed video. Specifically, if the encoding end 501 determines in step S602 that the video type of the to-be-processed video is the preset video type, each pixel accuracy in the initial pixel accuracy set stored in the encoding end 501 is modified to obtain the target pixel accuracy set, and in step S603 the to-be-processed video is encoded based on the target pixel accuracy set. If the encoding end 501 determines that the video type of the to-be-processed video is not the preset video type, the to-be-processed video is encoded using the initial pixel accuracy set stored in the encoding end 501.
Optionally, the preset video type described in step S602 may include any one or more of screen content video and natural video, or the preset video type may include either of long video and short video, or the preset video type may be another video type; the embodiment of the present invention does not limit the preset video type. In one embodiment, for the method by which the encoding end 501 determines the video type of the to-be-processed video, refer to the description of the related content in the embodiment of FIG. 4, which is not repeated here.
The encoding mode that the encoding end 501 uses to encode the to-be-processed video may be a first type of encoding mode or a second type of encoding mode, where the first type of encoding mode may be either the inter encoding mode or the affine encoding mode, and the second type of encoding mode may be the other of the inter encoding mode and the affine encoding mode. The initial pixel accuracy set of the encoding end 501 differs between encoding modes. Specifically, the initial pixel accuracy set corresponding to the first type of encoding mode may be the first initial pixel accuracy set, and the initial pixel accuracy set corresponding to the second type of encoding mode may be the second initial pixel accuracy set. After the encoding end 501 determines that the video type of the to-be-processed video is the preset video type and before it determines the target pixel accuracy set, the encoding end 501 also needs to determine the encoding mode. If the encoding mode is the first type of encoding mode, the encoding end 501 adjusts each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set; if the encoding mode is the second type of encoding mode, the encoding end 501 adjusts each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
In the encoding system shown in FIG. 5, after the encoding end 501 encodes the to-be-processed video based on the target pixel accuracy set in step S603 and obtains the encoded video, it can also add identification information to the encoded video in step S604 and send the encoded video with the identification information added to the decoding end 502. The identification information is information used to identify that the video type corresponding to the encoded video is the preset video type. In other words, if the decoding end 502 detects that the encoded video includes this identification information, it can determine that the video type corresponding to the encoded video is the preset video type.
The decoding end 502 receives the encoded video sent by the encoding end 501 in step S605 and can extract the information included in the encoded video. If it is determined in step S606 that the encoded video includes the identification information, the video type corresponding to the received encoded video is determined to be the preset video type. The decoding end 502 can then obtain the target pixel accuracy set that the encoding end 501 set for the preset video type, and in step S607 it decodes the encoded video based on the target pixel accuracy set.
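Steps S605 to S607 amount to a check on the received stream followed by a choice of pixel accuracy set. The sketch below is illustrative only: the `EncodedVideo` container, its `has_type_flag` field, and the fallback to the initial set when the identification information is absent are assumptions, since the text describes only the case in which the identification information is present.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple

PrecisionSet = Tuple[Fraction, ...]


@dataclass
class EncodedVideo:
    payload: bytes
    has_type_flag: bool  # True if the identification information is present


def decoding_precision_set(video: EncodedVideo,
                           initial_set: PrecisionSet,
                           target_set: PrecisionSet) -> PrecisionSet:
    """Choose the set the decoder uses (S606): the target set when the
    identification information marks the preset video type, otherwise
    the initial set (the fallback is an assumption of this sketch)."""
    return target_set if video.has_type_flag else initial_set


if __name__ == "__main__":
    initial = (Fraction(1), Fraction(1, 2), Fraction(1, 4))
    target = (Fraction(1), Fraction(4), Fraction(8))
    flagged = EncodedVideo(b"...", has_type_flag=True)
    print(decoding_precision_set(flagged, initial, target))
```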
In one embodiment, after performing step S603 to obtain the encoded video, the encoding end 501 may, in addition to adding identification information to the encoded video, also add an index identifier to the encoded video. The index identifier identifies the target pixel accuracy that the encoding end 501 used when encoding the to-be-processed video, and this target pixel accuracy is any one of the pixel accuracies in the target pixel accuracy set. It can be understood that the to-be-processed video is composed of multiple frames of images, so video encoding of the to-be-processed video is performed frame by frame; after each frame of image is encoded, the encoded video containing the encoding result of each frame is sent to the decoding end in the form of a bitstream. The target pixel accuracy used when encoding the to-be-processed video therefore essentially refers to the target pixel accuracy used for each frame of image of the to-be-processed video.
The encoding end 501 sends the encoded video with the index identifier added to the decoding end 502. The decoding end 502 determines, from the target pixel accuracy set, the target pixel accuracy identified by the index identifier, and then decodes the encoded video based on that target pixel accuracy. After decoding the encoded video based on the target pixel accuracy, the decoding end 502 obtains a video with a smaller data volume that can be displayed or stored.
In the encoding system provided by the embodiment of the present invention, the encoding end 501 sets the corresponding target pixel accuracy set according to the video type of the to-be-processed video and uses it to encode the to-be-processed video, which improves the quality of video encoding. The decoding end 502 can accurately select the target pixel accuracy set according to the identification information included in the encoded video sent by the encoding end 501 and decode the encoded video accordingly, which ensures that the video content is not damaged and improves decoding quality.
FIG. 7 is a schematic structural diagram of an encoding device provided by an embodiment of the present invention. The encoding device shown in FIG. 7 may include a memory 701 and a processor 702, where the memory 701 and the processor 702 are connected through a bus 703, program code is stored in the memory 701, and the processor 702 calls the program code in the memory 701.
The memory 701 may include volatile memory, such as random-access memory (RAM); the memory 701 may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); the memory 701 may also include a combination of the foregoing types of memory.
The processor 702 may be a central processing unit (CPU). The processor 702 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), or the like. The processor 702 may also be a combination of the foregoing structures.
In the embodiment of the present invention, the memory 701 is used to store a computer program, the computer program includes program instructions, and the processor 702 is used to execute the program instructions stored in the memory 701 to implement the steps of the corresponding method in the embodiment shown in FIG. 4.
In one embodiment, the processor 702 is configured to perform the following when calling the program instructions: determine the video type of the acquired to-be-processed video; if the video type of the to-be-processed video is the preset video type, modify each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set; and encode the to-be-processed video based on the target pixel accuracy set to obtain the encoded video.
In one embodiment, when determining the video type of the acquired to-be-processed video, the processor 702 performs the following operations: determining the hash value corresponding to the to-be-processed video; if the hash value is not greater than a threshold, determining that the video type of the to-be-processed video is the preset video type; and if the hash value is greater than the threshold, determining that the video type of the to-be-processed video is not the preset video type.
In one embodiment, when determining the video type of the acquired to-be-processed video, the processor 702 performs the following operations: calling a video type recognition model to recognize the to-be-processed video and obtain a recognition result; and if the video type indicated by the recognition result is the preset video type, determining that the video type of the to-be-processed video is the preset video type.
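The two ways of determining the video type described above, by hash threshold and by recognition model, reduce to two small predicates. The sketch is hedged: how the hash value is computed and what the recognition model looks like are not specified here, so the hash value is taken as an input and the model is represented by an arbitrary callable that returns a type label; all names are hypothetical.

```python
from typing import Any, Callable


def is_preset_by_hash(hash_value: int, threshold: int) -> bool:
    """Preset video type if the hash value is not greater than the threshold."""
    return hash_value <= threshold


def is_preset_by_model(video: Any,
                       model: Callable[[Any], str],
                       preset_label: str) -> bool:
    """Preset video type if the recognition result names the preset type.

    `model` stands in for the video type recognition model; any callable
    that maps a video to a type label works for this sketch.
    """
    return model(video) == preset_label


if __name__ == "__main__":
    print(is_preset_by_hash(hash_value=12, threshold=20))
    # A stand-in model that labels every input as screen content.
    print(is_preset_by_model(object(), lambda v: "screen_content", "screen_content"))
```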
In one embodiment, the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set, and the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set. When modifying each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set, the processor 702 performs the following operations: acquiring the encoding mode used when encoding the to-be-processed video; if the encoding mode is the first type of encoding mode, modifying each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set; and if the encoding mode is the second type of encoding mode, modifying each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
In one embodiment, the first type of encoding mode may be either the inter-frame (inter) encoding mode or the affine encoding mode, and the second type of encoding mode may be the other of the inter encoding mode and the affine encoding mode.
In one embodiment, when encoding the to-be-processed video based on the target pixel accuracy set to obtain the encoded video, the processor 702 performs the following operations: adding identification information to the encoded video; and sending the encoded video with the identification information added to the decoding end, where the identification information is used to instruct the decoding end to decode the encoded video based on the identification information.
在一个实施例中,所述处理器702在基于所述目标像素精度集合对所述待处理视频进行编码处理,得到编码视频时,执行如下操作:为目标像素精度集合中包括的各个像素精度设置索引标识;确定对所述待处理视频进行编码处理时使用的所述目标像素精度集合中的目标像素精度,以及所述目标像素精度对应的索引标识;为所述编码视频添加所述索引标识,并将添加了所述索引标识的编码视频发送给解码端,所述索引标识用于指示所述解码端基于所述索引标识对编码视频进行解码。In one embodiment, when the processor 702 encodes the to-be-processed video based on the target pixel accuracy set to obtain an encoded video, it performs the following operations: set the accuracy of each pixel included in the target pixel accuracy set Index identification; determining the target pixel accuracy in the target pixel accuracy set used when encoding the video to be processed, and the index identification corresponding to the target pixel accuracy; adding the index identification to the encoded video, And send the coded video to which the index identifier is added to the decoding terminal, where the index identifier is used to instruct the decoding terminal to decode the coded video based on the index identifier.
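The index signalling on the encoder side can be sketched as below. The sketch assumes that an index identifier is simply the position of an accuracy within the target set, and it uses a Python dict with the hypothetical keys `accuracy_index` and `payload` as a stand-in for real bitstream syntax; neither detail comes from the text.

```python
from fractions import Fraction


def assign_indices(target_set):
    """Give each accuracy in the target set an index identifier (its position)."""
    return {accuracy: index for index, accuracy in enumerate(target_set)}


def tag_encoded_video(payload, target_set, used_accuracy):
    """Attach the index of the accuracy actually used to the encoded payload.

    The returned dict is a placeholder container, not a real bitstream format.
    """
    indices = assign_indices(target_set)
    if used_accuracy not in indices:
        raise ValueError("accuracy is not in the target set")
    return {"accuracy_index": indices[used_accuracy], "payload": payload}
```

Sending an index rather than the accuracy value itself keeps the signalled field small, provided both ends hold the same target set in the same order.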
Referring to FIG. 8, which is a schematic structural diagram of a decoding device provided by an embodiment of the present invention, the decoding device shown in FIG. 8 may include a memory 801 and a processor 802, where the memory 801 and the processor 802 are connected through a bus 803, the memory 801 stores program code, and the processor 802 calls the program code in the memory 801.
The memory 801 may include a volatile memory, such as a random-access memory (RAM); the memory 801 may also include a non-volatile memory, such as a flash memory or a solid-state drive (SSD); the memory 801 may further include a combination of the foregoing types of memory.
The processor 802 may be a central processing unit (CPU). The processor 802 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The PLD may be a field-programmable gate array (FPGA), a generic array logic (GAL) device, or the like. The processor 802 may also be a combination of the foregoing structures.
In this embodiment of the present invention, the memory 801 is configured to store a computer program that includes program instructions, and the processor 802 is configured to execute the program instructions stored in the memory 801.
In one embodiment, the processor 802 is configured to perform the following when calling the program instructions: receiving an encoded video; when the encoded video includes identification information, determining that the video type corresponding to the encoded video is the preset video type; and decoding the encoded video based on a target pixel accuracy set, where the target pixel accuracy set is obtained by modifying each pixel accuracy value in an initial pixel accuracy set.
In one embodiment, the encoded video further includes an index identifier, and when decoding the encoded video based on the target pixel accuracy set, the processor 802 performs the following operations: determining, from the target pixel accuracy set, the target pixel accuracy identified by the index identifier; and decoding the encoded video based on that target pixel accuracy.
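The decoder-side behaviour in the two preceding paragraphs can be sketched as follows. Several details are assumptions made only for illustration: the dict keys `preset_type_flag` and `accuracy_index`, the initial set values, the `modify` rule, the fallback to the unmodified initial set when no identification information is present, and the default index of 0.

```python
from fractions import Fraction

# Hypothetical initial accuracy set, in units of pixels.
INITIAL_SET = [Fraction(1, 4), Fraction(1), Fraction(4)]


def modify(values, factor=4):
    """Hypothetical modification rule: scale every accuracy value by `factor`."""
    return [v * factor for v in values]


def select_decoding_accuracy(encoded_video):
    """Pick the accuracy the decoder should use for this encoded video.

    `encoded_video` is a dict standing in for parsed bitstream fields.
    """
    # Identification information present -> the video is of the preset type,
    # so the target (modified) set applies. Falling back to the initial set
    # otherwise is an assumption of this sketch.
    if encoded_video.get("preset_type_flag", False):
        accuracy_set = modify(INITIAL_SET)
    else:
        accuracy_set = INITIAL_SET

    index = encoded_video.get("accuracy_index", 0)
    if not 0 <= index < len(accuracy_set):
        raise ValueError(f"index identifier out of range: {index}")
    return accuracy_set[index]
```

The decoder rebuilds the target set locally with the same rule as the encoder, so only the flag and the index need to travel with the encoded video.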
A person of ordinary skill in the art will understand that all or part of the procedures of the foregoing method embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random-access memory (RAM), or the like.
The foregoing discloses only some embodiments of the present invention and certainly cannot be used to limit the scope of the rights of the present invention. Equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (21)

  1. A video processing method, characterized by comprising:
    determining the video type of an acquired video to be processed;
    if the video type of the video to be processed is a preset video type, modifying each pixel accuracy value in an initial pixel accuracy set to obtain a target pixel accuracy set;
    encoding the video to be processed based on the target pixel accuracy set to obtain an encoded video.
  2. The method according to claim 1, wherein determining the video type of the acquired video to be processed comprises:
    determining a hash value corresponding to the video to be processed;
    if the hash value is not greater than a threshold, determining that the video type of the video to be processed is the preset video type;
    if the hash value is greater than the threshold, determining that the video type of the video to be processed is not the preset video type.
  3. The method according to claim 1, wherein determining the video type of the acquired video to be processed comprises:
    calling a video type recognition model to recognize the video to be processed and obtain a recognition result;
    if the video type indicated by the recognition result is the preset video type, determining that the video type of the video to be processed is the preset video type.
  4. The method according to claim 1, wherein the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set, and the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set,
    and wherein modifying each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set comprises:
    obtaining the coding mode used when encoding the video to be processed;
    if the coding mode is a first-type coding mode, modifying each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set;
    if the coding mode is a second-type coding mode, modifying each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
  5. The method according to claim 4, wherein the first-type coding mode includes either one of an inter coding mode and an affine coding mode, and the second-type coding mode includes the other one of the inter coding mode and the affine coding mode.
  6. The method according to claim 1, wherein encoding the video to be processed based on the target pixel accuracy set to obtain the encoded video comprises:
    adding identification information to the encoded video;
    sending the encoded video to which the identification information has been added to a decoding end, where the identification information instructs the decoding end to decode the encoded video based on the identification information.
  7. The method according to claim 1, wherein encoding the video to be processed based on the target pixel accuracy set to obtain the encoded video comprises:
    setting an index identifier for each pixel accuracy included in the target pixel accuracy set;
    determining the target pixel accuracy, within the target pixel accuracy set, that is used when encoding the video to be processed, and the index identifier corresponding to the target pixel accuracy;
    adding the index identifier to the encoded video, and sending the encoded video to which the index identifier has been added to a decoding end, where the index identifier instructs the decoding end to decode the encoded video based on the index identifier.
  8. A video processing method, characterized by comprising:
    receiving an encoded video;
    when the encoded video includes identification information, determining that the video type corresponding to the encoded video is a preset video type;
    decoding the encoded video based on a target pixel accuracy set;
    wherein the target pixel accuracy set is obtained by modifying each pixel accuracy value in an initial pixel accuracy set.
  9. The method according to claim 8, wherein the encoded video further includes an index identifier, and decoding the encoded video based on the target pixel accuracy set comprises:
    determining, from the target pixel accuracy set, the target pixel accuracy identified by the index identifier;
    decoding the encoded video based on the target pixel accuracy.
  10. A video processing apparatus, characterized by comprising:
    a determining unit, configured to determine the video type of an acquired video to be processed;
    a processing unit, configured to, if the determining unit determines that the video type of the video to be processed is a preset video type, modify each pixel accuracy value in an initial pixel accuracy set to obtain a target pixel accuracy set;
    the processing unit being further configured to encode the video to be processed based on the target pixel accuracy set to obtain an encoded video.
  11. A video processing apparatus, characterized by comprising:
    a receiving unit, configured to receive an encoded video;
    a processing unit, configured to determine, when the encoded video includes identification information, that the video type corresponding to the encoded video is a preset video type;
    the processing unit being further configured to decode the encoded video based on a target pixel accuracy set;
    wherein the target pixel accuracy set is obtained by modifying each pixel accuracy value in an initial pixel accuracy set.
  12. An encoding device, characterized by comprising a memory and a processor:
    the memory being configured to store program code;
    the processor calling the program code and, when the program code is executed, being configured to perform the following operations:
    determining the video type of an acquired video to be processed;
    if the video type of the video to be processed is a preset video type, modifying each pixel accuracy value in an initial pixel accuracy set to obtain a target pixel accuracy set;
    encoding the video to be processed based on the target pixel accuracy set to obtain an encoded video.
  13. The encoding device according to claim 12, wherein, when determining the video type of the acquired video to be processed, the processor performs the following operations:
    determining a hash value corresponding to the video to be processed;
    if the hash value is not greater than a threshold, determining that the video type of the video to be processed is the preset video type;
    if the hash value is greater than the threshold, determining that the video type of the video to be processed is not the preset video type.
  14. The encoding device according to claim 12, wherein, when determining the video type of the acquired video to be processed, the processor performs the following operations:
    calling a video type recognition model to recognize the video to be processed and obtain a recognition result;
    if the video type indicated by the recognition result is the preset video type, determining that the video type of the video to be processed is the preset video type.
  15. The encoding device according to claim 12, wherein the initial pixel accuracy set includes a first initial pixel accuracy set and a second initial pixel accuracy set, and the target pixel accuracy set includes a first target pixel accuracy set and a second target pixel accuracy set,
    and wherein, when modifying each pixel accuracy value in the initial pixel accuracy set to obtain the target pixel accuracy set, the processor performs the following operations:
    obtaining the coding mode used when encoding the video to be processed;
    if the coding mode is a first-type coding mode, modifying each pixel accuracy value in the first initial pixel accuracy set to obtain the first target pixel accuracy set;
    if the coding mode is a second-type coding mode, modifying each pixel accuracy value in the second initial pixel accuracy set to obtain the second target pixel accuracy set.
  16. The encoding device according to claim 15, wherein the first-type coding mode includes either one of an inter coding mode and an affine coding mode, and the second-type coding mode includes the other one of the inter coding mode and the affine coding mode.
  17. The encoding device according to claim 12, wherein, when encoding the video to be processed based on the target pixel accuracy set to obtain the encoded video, the processor performs the following operations:
    adding identification information to the encoded video;
    sending the encoded video to which the identification information has been added to a decoding end, where the identification information instructs the decoding end to decode the encoded video based on the identification information.
  18. The encoding device according to claim 12, wherein, when encoding the video to be processed based on the target pixel accuracy set to obtain the encoded video, the processor performs the following operations:
    setting an index identifier for each pixel accuracy included in the target pixel accuracy set;
    determining the target pixel accuracy, within the target pixel accuracy set, that is used when encoding the video to be processed, and the index identifier corresponding to the target pixel accuracy;
    adding the index identifier to the encoded video, and sending the encoded video to which the index identifier has been added to a decoding end, where the index identifier instructs the decoding end to decode the encoded video based on the index identifier.
  19. A decoding device, characterized by comprising a memory and a processor:
    the memory being configured to store program code;
    the processor calling the program code and, when the program code is executed, being configured to perform the following operations:
    receiving an encoded video;
    when the encoded video includes identification information, determining that the video type corresponding to the encoded video is a preset video type;
    decoding the encoded video based on a target pixel accuracy set;
    wherein the target pixel accuracy set is obtained by modifying each pixel accuracy value in an initial pixel accuracy set.
  20. The decoding device according to claim 19, wherein the encoded video further includes an index identifier, and when decoding the encoded video based on the target pixel accuracy set, the processor performs the following operations:
    determining, from the target pixel accuracy set, the target pixel accuracy identified by the index identifier;
    decoding the encoded video based on the target pixel accuracy.
  21. A computer-readable storage medium, wherein the computer-readable storage medium stores a first computer program, the first computer program includes first program instructions, and the first program instructions, when executed by a processor, cause the processor to perform the video processing method according to any one of claims 1 to 7; or the computer-readable storage medium stores a second computer program, the second computer program includes second program instructions, and the second program instructions, when executed by a processor, cause the processor to perform the video processing method according to claim 8 or 9.
PCT/CN2019/078050 2019-03-13 2019-03-13 Video processing method and device, encoding apparatus, and decoding apparatus WO2020181540A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/078050 WO2020181540A1 (en) 2019-03-13 2019-03-13 Video processing method and device, encoding apparatus, and decoding apparatus
CN201980005058.6A CN111567044A (en) 2019-03-13 2019-03-13 Video processing method and device, coding equipment and decoding equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/078050 WO2020181540A1 (en) 2019-03-13 2019-03-13 Video processing method and device, encoding apparatus, and decoding apparatus

Publications (1)

Publication Number Publication Date
WO2020181540A1 true WO2020181540A1 (en) 2020-09-17

Family

ID=72075486

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/078050 WO2020181540A1 (en) 2019-03-13 2019-03-13 Video processing method and device, encoding apparatus, and decoding apparatus

Country Status (2)

Country Link
CN (1) CN111567044A (en)
WO (1) WO2020181540A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105981382A (en) * 2014-09-30 2016-09-28 微软技术许可有限责任公司 Hash-Based Encoder Decisions For Video Coding
CN106331703A (en) * 2015-07-03 2017-01-11 华为技术有限公司 Video coding and decoding method, and video coding and decoding device
CN107005708A (en) * 2014-09-26 2017-08-01 Vid拓展公司 Decoding is replicated in the block of use time block vector forecasting
WO2019009504A1 (en) * 2017-07-07 2019-01-10 삼성전자 주식회사 Apparatus and method for encoding motion vector determined using adaptive motion vector resolution, and apparatus and method for decoding motion vector
CN109391814A (en) * 2017-08-11 2019-02-26 华为技术有限公司 Encoding video pictures and decoded method, device and equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8411756B2 (en) * 2009-05-21 2013-04-02 Ecole De Technologie Superieure Method and system for generating block mode conversion table for efficient video transcoding
US10178406B2 (en) * 2009-11-06 2019-01-08 Qualcomm Incorporated Control of video encoding based on one or more video capture parameters
US20140126644A1 (en) * 2011-06-30 2014-05-08 Telefonaktiebolaget L M Ericsson (Publ) A Method a Decoder and Encoder for Processing a Motion Vector
CN103260020A (en) * 2012-02-18 2013-08-21 张新安 Quick integer pixel motion estimation method of AVS-M video coding
US9749642B2 (en) * 2014-01-08 2017-08-29 Microsoft Technology Licensing, Llc Selection of motion vector precision

Also Published As

Publication number Publication date
CN111567044A (en) 2020-08-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19919370

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19919370

Country of ref document: EP

Kind code of ref document: A1