WO2020108033A1 - Transcoding method, transcoding device, and computer-readable storage medium - Google Patents
Transcoding method, transcoding device, and computer-readable storage medium
- Publication number
- WO2020108033A1 (PCT/CN2019/106804)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- video
- encoding
- unit
- coding unit
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/467—Embedding additional information in the video signal during the compression process characterised by the embedded information being invisible, e.g. watermarking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/835—Generation of protective data, e.g. certificates
- H04N21/8358—Generation of protective data, e.g. certificates involving watermark
Definitions
- This application belongs to the field of computer software applications, and in particular relates to transcoding methods, transcoding devices, and computer-readable storage media.
- Transcoding is a process of decoding and then encoding the original compressed video stream.
- A broad application requirement is to add graphic information to a certain area of an encoded video (an area that is fixed or changes relative to the display), such as watermark pictures, subtitles, picture-in-picture, and the magic emoticons and stickers that appear in live broadcasts.
- The full-decode, full-re-encode transcoding method in the related art mainly has the following disadvantages:
- The full-decode, full-re-encode method requires a large amount of calculation, which gives the processor a larger workload to handle, and the encoding takes longer;
- The encoders used in the initial encoding and the re-encoding may differ, or the encoding parameters used in the two passes may differ, so that parameters such as resolution and bit rate of the original video and the transcoded new video are inconsistent. This reduces the clarity of the transcoded image compared with the original video, or degrades the smoothness of the new video during playback, causing a loss of video quality.
- the present application discloses a transcoding method, a transcoding device, and a computer-readable storage medium.
- An embodiment of the present application provides a transcoding method, including:
- decoding, by a decoder, the original video to obtain a video frame sequence of the original video and encoding information of the original video;
- adding the graphic information to the video frame sequence to obtain a video frame sequence with the added graphic information;
- encoding, by an encoder using the encoding information, the video frame sequence with the added graphic information to obtain a new video.
- an embodiment of the present application provides a transcoding device, including:
- an obtaining module configured to decode the original video by a decoder to obtain a video frame sequence of the original video and encoding information of the original video, wherein the original video is a video to which graphic information needs to be added;
- an adding module configured to add the graphic information to the video frame sequence to obtain a video frame sequence with the added graphic information;
- an encoding module configured to encode, by the encoder, the video frame sequence with the added graphic information to obtain a new video.
- an embodiment of the present application provides a transcoding device, including:
- a memory for storing processor-executable instructions; and
- a processor configured to perform any one of the transcoding methods described above.
- an embodiment of the present application provides a computer-readable storage medium that stores computer instructions, and when the computer instructions are executed, the transcoding method described in the first aspect is implemented.
- An embodiment of the present application provides a computer program product, including a computer program; the computer program includes program instructions which, when executed by an electronic device, cause the electronic device to perform any of the transcoding methods described above.
- In the transcoding method provided by the embodiments of the present application, the decoder decodes the original video to obtain the video frame sequence and the encoding information of the original video; the encoding information is obtained easily and quickly during the decoding process. The encoder uses the encoding information to encode the video frame sequence with the added graphic information to obtain a new video, which reduces the time spent on calculating encoding decisions and keeps the new video consistent with the original video in information such as resolution, bit rate, and frame rate, so the quality of the new video is greatly improved. This alleviates the technical problems of traditional transcoding methods, namely that they take a long time and that quality is easily damaged.
- FIG. 1 is a schematic diagram of the principle of the conventional transcoding method;
- FIG. 2 is a flowchart of a transcoding method provided in Embodiment 1;
- FIG. 3 is a flowchart of a method for encoding a video frame sequence added with graphic information by using encoding information according to Embodiment 1;
- FIG. 4 is a flowchart of a method for encoding a second basic coding unit using slice information and coding unit information according to Embodiment 1;
- FIG. 5 shows a mapping relationship of encoding information in an exemplary embodiment;
- FIG. 6 is a structural block diagram of a transcoding device provided in Embodiment 2.
- FIG. 7 is a structural block diagram of a transcoding device provided in Embodiment 3.
- FIG. 8 is a structural block diagram of another transcoding device provided in Embodiment 3.
- The transcoding method in the related art decodes the source stream of the original video into video in a raw format, for example the YUV format (one luminance component Y and two chrominance components U and V); superimposes graphic information onto a specific area of the video; and then encodes it again.
- Figure 1 shows the principle diagram of the conventional transcoding method. Referring to Figure 1, the process of conventional transcoding is as follows:
- the decoder decodes the source stream (compressed video stream) that needs to add graphic information into a sequence of video frames in YUV format;
- the sequence of YUV video frames with the added graphic information enters the encoder and is encoded again to generate a transcoded stream, forming a new video.
- This transcoding method requires full decoding and full encoding, i.e. decoding the entire source stream and re-encoding the entire decoded frame sequence, which is relatively time-consuming.
- The GOP (Group of Pictures) structures of different video streams are not the same.
- The GOP structure of the source stream is difficult to determine programmatically. Because GOPs may lengthen or shorten, it is hard to treat different video streams differently, and transcoding them uniformly with the same encoding parameters destroys the GOP structure of the source stream.
- The length of the GOP determines the size of the image frame delay; therefore, changes in the GOP structure cause the image frame delay to change.
- the encoder takes into account the different importance of different frame types when encoding.
- I frames are usually assigned a smaller quantization parameter (QP) to retain higher image quality;
- P frames come next;
- B frames are assigned larger QP values and have the relatively worst image quality.
- the change in the video GOP structure makes it possible for the same frame of the new video and the original video to use different frame types.
- the I frame of the source stream may become a P frame or even a B frame, and the original P/B frame may be used as an I frame by the transcoded stream, thereby compromising the overall quality of the video stream.
- Bit rate information is not declared in the video header of the HEVC standard; it is non-standard data.
- bit rate data may be stored in metadata in mp4 format.
- The metadata of many bitstreams has no video bit rate data, or the video bit rate value is incorrect; therefore, metadata cannot provide a reliable bit rate.
- Most currently encoded streams use the ABR (available bit-rate) rate control method.
- The bit rate therefore changes in real time. In this case it is also very difficult to monitor the source stream bit rate, notify the encoder, and make changes in real time. At the same time, because the GOP structure is changed, the same frame of the new video and the original video may use different frame types, making it impossible to keep the bit rate of the transcoded stream and the source stream consistent at all times.
- In summary, the full-decode, full-re-encode transcoding method leads to inconsistency between the resolution and bit rate of the original video and the new video, which reduces the clarity of the new video image or reduces the smoothness of the video, degrading video quality.
- The embodiments of the present application provide a transcoding method, a transcoding device, and a computer-readable storage medium to solve the technical problem that the full-decode, full-re-encode transcoding method takes a long time and its quality is easily damaged.
- Embodiment 1 of the present application provides a transcoding method, as shown in FIG. 2, including:
- Step S102: Decode the original video by the decoder to obtain the video frame sequence of the original video and the encoding information of the original video, where the original video is a video to which graphic information needs to be added;
- Step S104: Add graphic information to the video frame sequence to obtain the video frame sequence with the added graphic information;
- Step S106: The encoder uses the encoding information to encode the video frame sequence with the added graphic information to obtain a new video.
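The three steps S102-S106 can be sketched with toy stand-ins; the class and method names (`ToyDecoder.decode_with_info`, `ToyEncoder.encode_reusing_info`, `add_graphic`) are illustrative assumptions, not a real codec API:

```python
# Hypothetical sketch of steps S102-S106: decode once while keeping the
# encoding information, overlay graphics, then re-encode reusing that
# information instead of recomputing coding decisions.

class ToyDecoder:
    def decode_with_info(self, original_video):
        # Step S102: return the frame sequence AND the encoding
        # information (frame/slice/coding-unit parameters).
        frames = list(original_video["frames"])
        encoding_info = original_video["encoding_info"]
        return frames, encoding_info

class ToyEncoder:
    def encode_reusing_info(self, frames, encoding_info):
        # Step S106: reuse the recorded parameters rather than
        # recomputing coding decisions for every unit.
        return {"frames": frames, "encoding_info": encoding_info}

def add_graphic(frame, graphic_info):
    # Step S104: overlay graphic information (watermark, subtitle, ...).
    return frame + "+" + graphic_info

def transcode(original_video, graphic_info):
    frames, info = ToyDecoder().decode_with_info(original_video)
    frames = [add_graphic(f, graphic_info) for f in frames]
    return ToyEncoder().encode_reusing_info(frames, info)

source = {"frames": ["f0", "f1"], "encoding_info": {"width": 1920}}
new_video = transcode(source, "wm")
```

The key difference from the full-decode, full-re-encode flow of FIG. 1 is that the encoding information travels from the decoder to the encoder alongside the frames.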
- the encoder and the decoder are two different functional modules, wherein the encoder is used to encode the video frame sequence, and the decoder is used to decode the original video.
- the encoder and the decoder may be two separate devices, or two functional modules integrated in one device casing.
- the embodiment of the present application does not limit the encoder and the decoder.
- The graphic information that needs to be added to the video frame sequence of the original video includes, but is not limited to: picture watermarks, audio watermarks, subtitles, bullet-screen comments, picture-in-picture, stickers, and magic emoticons.
- step S102 the original video is decoded by the decoder to obtain the video frame sequence and encoding information of the original video.
- The video frame sequence and the encoding information may be stored in different locations of the original video, and the decoder can obtain both through one or more parsing passes.
- The encoding information of the original video includes frame information, slice information, and coding unit information. The frame information is the video feature data of the image frames of the original video, such as basic feature data like video width and height; the slice information is the coding parameters of each slice of the original video; and the coding unit information is the coding parameters of the first basic coding units that constitute each image frame of the original video.
- Each image frame can be divided into multiple slices, and each slice can be divided into multiple basic coding units.
- the storage structure of frame information, slice information, and coding unit information and the storage location in the original video may be different, and the terms used to represent the basic coding unit may be different.
- In the HEVC standard, the frame information is the characteristic data of the image frame, and the basic coding unit is the coding tree unit.
- the frame information is stored in the video header information of the original video.
- the slice information is the information of the slices constituting the image frame of the original video, and the coding parameters of the first slice of the first image frame may be used.
- the coding unit information is the coding parameter of the coding tree unit constituting the image frame of the original video, and the coding parameter of the first coding tree unit in the first slice may be used.
- The video header information, as the video feature data of the original video, is the most important video information. It contains basic feature data such as the width and height of the original video; these data are usually used when the encoder is initialized.
- Slice information is the header information of a slice.
- A slice is an image division unit of High Efficiency Video Coding (HEVC).
- A frame of an image can be divided into multiple slices or encoded as a single slice; in many cases a frame image is encoded as one slice.
- the slice header information contains some encoding parameters used by the slice to configure the encoding implementation.
- The coding tree unit (CTU) is the basic unit of HEVC video coding.
- The CTU size can range from 8x8 to 64x64, and one slice can include one or more CTUs.
- the coding tree unit information is coding parameters used by the coding tree unit.
- The video header information, slice information, and coding tree unit information constitute the encoding information of an HEVC-standard original video, and they are conveniently obtained while parsing the original video.
- The video header information, slice information, and coding tree unit information fully describe the parameters of the encoder's original HEVC encoding process, so that when, in step S106, the encoder uses this encoding information to encode the video frame sequence with the added graphic information, the new video maintains good consistency with the original video, ensuring that the quality of the new video is not compromised.
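The three levels of encoding information described above might be grouped as follows. This is a minimal sketch with assumed field names, not the patent's actual data layout:

```python
# Hypothetical grouping of the HEVC encoding information into the three
# levels named in the text: frame (video header), slice, and CTU.
from dataclasses import dataclass, field

@dataclass
class FrameInfo:            # video header: basic feature data
    width: int
    height: int
    frame_rate: float

@dataclass
class SliceInfo:            # slice header: per-slice coding parameters
    display_order: int
    ref_frame_count: int

@dataclass
class CtuInfo:              # per-CTU coding parameters
    cu_depths: list = field(default_factory=list)  # recorded split record
    qp: int = 32

@dataclass
class EncodingInfo:
    frame: FrameInfo
    slices: list
    ctus: list

info = EncodingInfo(
    frame=FrameInfo(width=1920, height=1080, frame_rate=30.0),
    slices=[SliceInfo(display_order=0, ref_frame_count=1)],
    ctus=[CtuInfo(cu_depths=[0, 1, 1], qp=30)],
)
```

Passing such a structure from decoder to encoder is what allows the re-encoding pass to skip recomputing these parameters.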
- The video header information may include a video parameter set (VPS), a sequence parameter set (SPS), and a picture parameter set (PPS).
- the encoder can also refer to data such as the video parameter set, sequence parameter set, and image parameter set, which can better restore the video characteristics of the original video.
- The PPS includes setting information that differs for each frame of the image; the setting information mainly includes self-referencing information, initial image control information (such as the initial QP), and block information.
- At the beginning of decoding, all PPSs are inactive, and at any time during decoding, at most one PPS can be active.
- When a PPS is activated it is called the active PPS, and it remains active until another PPS is activated.
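The active-PPS rule (all PPSs start inactive, at most one is active at a time, and activating a new PPS deactivates the previous one) can be sketched as simple bookkeeping; the `PpsPool` class here is a hypothetical illustration, not decoder code:

```python
# Hypothetical bookkeeping for the active-PPS rule described above.
class PpsPool:
    def __init__(self, pps_ids):
        # At the beginning of decoding, every PPS is inactive.
        self.pool = {i: "inactive" for i in pps_ids}
        self.active = None

    def activate(self, pps_id):
        # At most one PPS is active: deactivate the previous one first.
        if self.active is not None:
            self.pool[self.active] = "inactive"
        self.pool[pps_id] = "active"
        self.active = pps_id

pool = PpsPool([0, 1, 2])
pool.activate(0)
pool.activate(1)   # PPS 0 becomes inactive again
```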
- SPS provides the information required by all slices in the video sequence.
- The content of the SPS can include: decoding-related information, such as profile and level, resolution, and number of sub-layers; function-switch identifiers and parameters for functions at a certain profile; restriction information on the flexibility of the coding structure and of transform coefficient coding; and temporal scalability information.
- The VPS is used to describe the overall structure of the encoded video sequence, including temporal sub-layer dependencies.
- The main purpose of adding this structure in HEVC is to enable compatible extension of the standard in systems with multiple sub-layers. All sub-layers of a given video sequence share one VPS, even if their SPSs differ.
- The main information contained in the VPS is: syntax elements shared by multiple sub-layers or operation points; key session information such as profile and level; and other operation-point-specific information that is not part of the SPS.
- VP9 has no video header information such as VPS/SPS/PPS, only header information at the image frame level.
- Each image frame has an uncompressed header and a compressed header; the corresponding frame information is stored in the uncompressed header.
- The uncompressed header also contains some other information, such as some of the information carried by the SPS, PPS, and slice information in HEVC.
- The compressed header holds the probability tables used for entropy coding of each syntax element of the current frame. Therefore, for VP9, the encoding information that can be obtained from the image frame header information includes frame information and slice information.
- the frame information is basic feature information of the video image.
- the VP9 standard adopts the coding hierarchy of image frames/slices/superblocks/blocks.
- An image frame can be divided into 64x64 superblocks, and slices are divided along superblock boundaries; the slice division is declared in the uncompressed header.
- A superblock (SB) is the basic coding unit of VP9 video coding.
- Each SB can be recursively divided into blocks in the form of a quadtree.
- The coding parameters of the superblock, such as the SB division method, the block coding mode, the motion vector (mv), and the quantizer, are used as the coding unit information.
- step S106 using the encoding information to encode the video frame sequence after adding the graphic information includes:
- Step S301: Obtain frame information, slice information, and coding unit information from the decoder, and obtain the video frame sequence with the added graphic information.
- step S302 the frame information is used to initialize the encoder.
- the frame information represents the basic feature information of the original video.
- the frame information is used to initialize the encoder, so that the new video and the original video maintain consistency in the configuration parameters of the encoder used.
- Step S303: Divide the video frame sequence with the added graphic information into second basic coding units.
- In HEVC the basic coding unit is the coding tree unit, and in VP9 the basic coding unit is the superblock.
- the first basic coding unit and the second basic coding unit are only used to distinguish two different basic coding units.
- Each frame in the video frame sequence with the added graphic information is divided into basic coding units of fixed size in raster-scan order (from left to right, then from top to bottom).
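The raster-scan division into fixed-size basic coding units can be sketched as follows; the 64x64 unit size and the 1920x1080 frame dimensions are illustrative assumptions:

```python
# Sketch of step S303: enumerate basic coding unit positions in
# raster-scan order (left to right, then top to bottom).
def raster_units(width, height, unit=64):
    """Return (x, y) top-left corners of each basic coding unit.
    Units on the right/bottom edge may be clipped by the frame boundary."""
    positions = []
    for y in range(0, height, unit):        # top to bottom
        for x in range(0, width, unit):     # left to right
            positions.append((x, y))
    return positions

# A 1920x1080 frame with 64x64 units yields 30 columns x 17 rows.
units = raster_units(1920, 1080)
```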
- Step S304: Encode the second basic coding units according to the slice information and the coding unit information through the initialized encoder.
- each coding tree unit can be recursively divided into multi-level coding units (Coding Units, CU for short) in the form of a quadtree.
- each super block can be recursively divided into multi-level blocks in the form of a quadtree.
- The coding unit information includes the CU depth and the division method used in dividing the CTU into CUs.
- the coding unit information is used to divide the second coding tree unit into coding units, so that the division of the coding unit is consistent with the division of the coding unit in the original video coding process.
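Reusing the recorded division so that the new CTU-to-CU partition matches the original might look like the following replay of split decisions; the preorder split-flag encoding is an assumption for illustration, not the HEVC bitstream syntax:

```python
# Sketch: replay recorded quadtree split decisions so the CU division of
# the re-encoded CTU matches the original encoding. `flags` is a
# preorder list of booleans consumed front to back.
def replay_splits(x, y, size, flags, out=None):
    """Divide a CTU at (x, y) of the given size into CU leaves."""
    if out is None:
        out = []
    split = flags.pop(0)
    if split and size > 8:                 # 8x8 is the smallest CU here
        half = size // 2
        for dy in (0, half):               # four quadrants, raster order
            for dx in (0, half):
                replay_splits(x + dx, y + dy, half, flags, out)
    else:
        out.append((x, y, size))           # leaf CU
    return out

# One split at the top level of a 64x64 CTU, then four 32x32 leaves:
cus = replay_splits(0, 0, 64, [True, False, False, False, False])
```

Because the split record comes from the decoder, no rate-distortion search is needed to reproduce the partition.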
- Intra-frame and inter-frame prediction, discrete cosine transform (DCT), and quantization are performed in units of CUs; a run-length scan is then performed on the transformed and quantized residual coefficients; finally, entropy coding completes the encoding process.
- The slice information includes the frame display order, the reference frame number, reference data set information, and so on.
- The coding tree unit information includes the CU depth and division method, the coding mode, the quantization parameter (QP), sample adaptive offset (SAO) parameters, and so on. Encoding the second coding tree unit using the slice information and the coding tree unit information ensures that the new video and the original video remain consistent in coding unit encoding.
- the coding unit information includes the block depth and the division method in the process of dividing the super block into blocks.
- the coding unit information is used to divide the second super block into multiple blocks, so that the division of the block is consistent with the division of the block in the original video coding process.
- Intra-frame and inter-frame prediction, discrete cosine transform (DCT), and quantization are performed in units of blocks; a run-length scan is then performed on the transformed and quantized residual coefficients; finally, entropy coding completes the encoding process.
- In this way, the configuration parameters of the encoder used by the new video and the original video are kept consistent, the division of the coding units is kept consistent, and the coding of the coding units is kept consistent, so the new video and the original video are consistent in terms of video quality, alleviating the technical problem of impaired video quality.
- Step S304, encoding the second basic coding units according to the slice information and the coding unit information, includes:
- Step S401: Acquire the position information of each second basic coding unit;
- Step S402: Based on the position information, determine whether the current second basic coding unit is related to the coverage area of the graphic information, and obtain a judgment result;
- The judgment result of this step is either that the current second basic coding unit is related to the coverage area of the graphic information, or that it is not.
- Step S403: According to the judgment result, determine whether to encode the second basic coding unit using the slice information and the coding unit information.
- When the current second basic coding unit has nothing to do with the coverage area of the graphic information, adding the graphic information does not change the current second basic coding unit, so it remains unchanged; in this case, its coding decision also remains unchanged, thereby maintaining consistency with the original video quality.
- When the current second basic coding unit is related to the area covered by the graphic information, adding the graphic information changes the current second basic coding unit. In that case, the coding decision used for encoding the second basic coding unit is re-determined, and the newly determined coding decision is used to encode it.
- The relationship between the second basic coding unit and the coverage area of the graphic information is used to determine whether to encode the second basic coding unit using the slice information and the coding unit information; the influence of the coverage area on the second basic coding unit is fully considered, making the coding decision for the second basic coding unit more reasonable and scientific.
- step S402 determines whether the current second coding tree unit is related to the coverage area of the graphic information, including:
- the first condition is that the current second basic coding unit is located in the area covered by the graphic information
- the second condition is that the current second basic coding unit is in inter mode and satisfies either of the following: it references an image in the coverage area, or its video motion vector prediction is affected by a target coding tree unit, where the target coding tree unit is a second basic coding unit, adjacent to the current second basic coding unit, that has already been judged to be related to the coverage area.
- When the current second basic coding unit satisfies either of the first and second conditions, the judgment result of step S402 is that the current second basic coding unit is related to the area covered by the graphic information; when the current second basic coding unit satisfies neither the first condition nor the second condition, the judgment result of step S402 is that the current second basic coding unit is unrelated to the area covered by the graphic information.
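The two conditions above amount to a simple geometric and dependency test per coding unit. The following Python sketch illustrates one possible form of the step S402 judgment; the data layout (the `ctu` dictionary fields, the rectangle tuple) is purely illustrative and not part of the patent.

```python
def ctu_overlaps(ctu_x, ctu_y, ctu_size, region):
    """First condition: does the coding unit's square intersect the overlay region?

    region is a hypothetical (x, y, width, height) rectangle in pixels."""
    rx, ry, rw, rh = region
    return ctu_x < rx + rw and rx < ctu_x + ctu_size and \
           ctu_y < ry + rh and ry < ctu_y + ctu_size


def is_affected(ctu, region, affected_neighbors):
    """Step S402 judgment (sketch): the unit is 'related' to the coverage area
    if it satisfies either the first or the second condition."""
    # First condition: the unit lies in the area covered by the graphic information.
    if ctu_overlaps(ctu["x"], ctu["y"], ctu["size"], region):
        return True
    # Second condition: inter mode, and it references the covered area, or its
    # motion vector prediction depends on an adjacent already-affected unit.
    if ctu["mode"] == "inter":
        if ctu["refs_covered_area"] or affected_neighbors:
            return True
    return False
```

In practice the "references the covered area" and neighbor-dependency checks would come from the decoded motion information rather than precomputed flags; the flags here only stand in for that logic.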
- In step S403, determining, according to the judgment result, whether to encode the second basic coding unit using the slice information and coding unit information further includes:
- when the judgment result is that the current second basic coding unit is unrelated to the area covered by the graphic information, encoding the second basic coding unit using the slice information and coding unit information;
- when the judgment result is that the current second basic coding unit is related to the area covered by the graphic information, re-determining the coding decision used for encoding the second basic coding unit, and encoding the second coding unit using the re-determined coding decision.
- Specifically, when the judgment result is that the current second basic coding unit is related to the area covered by the graphic information, the coding decision used for encoding the second basic coding unit is re-determined, which specifically includes determining the CU or block depth and partition mode, the coding mode, and so on.
- In the embodiments of the present application, when the current second basic coding unit is unrelated to the area covered by the graphic information, the second basic coding unit is encoded using the slice information and coding unit information; that is, no coding decision needs to be computed for the current second coding unit. Since the second basic coding units unrelated to the coverage area generally account for a large proportion, the amount of computation during encoding is greatly reduced, the processor load is lightened, and transcoding is accelerated, alleviating the technical problem that traditional transcoding methods are time-consuming.
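The resulting control flow of steps S402 and S403 is a per-unit branch between reusing the decoded decision and recomputing it. A minimal sketch, assuming hypothetical callback names that stand in for the encoder's actual decision logic:

```python
def encode_frame(ctus, overlay_affects, reuse_decision, recompute_decision):
    """Per-unit branch of step S403 (illustrative):

    - overlay_affects(ctu): the step S402 judgment
    - reuse_decision(ctu): copy the decision recovered from the decoder
    - recompute_decision(ctu): run the full coding decision only where needed
    """
    decisions = []
    reused = 0
    for ctu in ctus:
        if overlay_affects(ctu):
            decisions.append(recompute_decision(ctu))  # overlay changed the picture here
        else:
            decisions.append(reuse_decision(ctu))      # reuse slice/coding unit info
            reused += 1
    return decisions, reused
```

Because most units typically fall in the `reuse_decision` branch, the expensive decision search runs only on the small affected fraction, which is the source of the speed-up described above.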
- the encoder and the decoder are communicatively connected so that the encoder obtains encoding information from the decoder, where,
- the decoder transmits the encoding information to the encoder in the first data structure and the first data arrangement manner;
- the encoder receives the encoded information from the decoder in a second data structure and a second data arrangement, where,
- the second data structure is the same as the first data structure, and the second data arrangement is the same as the first data arrangement.
- In the embodiments of the present application, the encoder and the decoder are communicatively connected and transfer the encoding information with the same data structure and the same data arrangement, ensuring that the encoding information is passed between the encoder and the decoder quickly and accurately.
- the encoder and the decoder are communicatively connected so that the encoder obtains encoding information from the decoder, where,
- the decoder transmits encoding information to the encoder in a third data structure and a third data arrangement manner
- after receiving the encoding information, the encoder stores the encoding information according to a fourth data structure and a fourth data arrangement based on a mapping relationship, where,
- the fourth data structure is different from the third data structure, and/or, the fourth data arrangement is different from the third data arrangement;
- the mapping relationship is the correspondence between a first position and a second position, where the first position is the position of the encoding information in the third data structure and the third data arrangement, and the second position is the position of the encoding information in the fourth data structure and the fourth data arrangement.
- Specifically, assuming the encoding information includes A, B, and C, arranged in the order ACB in the third data structure and third data arrangement and in the order ABC in the fourth data structure and fourth data arrangement, the mapping relationship is as shown by the arrows in FIG. 5.
- For example, when the encoder and decoder store the quantization parameter as encoding information, possibly because of differences in calculation method, one may use the number of coding units in the horizontal direction of the image as the unit line width of the data, while the other uses that number plus 1 as the unit line width; that is, the encoder and decoder differ in data structure.
- When they communicate, the encoder maps the quantization parameter array in the third data structure, line by line according to the mapping relationship, to the quantization parameter array of the fourth data structure and stores it.
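As an illustration of this line-by-line remapping, the sketch below copies a flat QP array stored with one unit line width into an array with another; the padding policy for an extra destination column (repeating the last QP of the row) is an assumption for the example, not something specified here.

```python
def remap_qp_rows(qp_flat, src_stride, dst_stride, rows):
    """Map a flat QP array stored with unit line width src_stride into one
    stored with unit line width dst_stride, row by row.

    E.g. src_stride = CTUs per image row, dst_stride = CTUs per row + 1."""
    dst = []
    for r in range(rows):
        row = list(qp_flat[r * src_stride:(r + 1) * src_stride])
        if dst_stride > src_stride:
            # Destination rows are wider: pad with the last QP of the row.
            row += [row[-1]] * (dst_stride - src_stride)
        else:
            # Destination rows are narrower: drop the trailing cells.
            row = row[:dst_stride]
        dst.extend(row)
    return dst
```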
- As another example, when the encoder and decoder store motion vector information as encoding information, one may use the smallest prediction unit as the storage unit and store the motion vector information in raster scan order over the entire image, while the other first stores motion vector information within each coding tree unit, using the smallest prediction unit as the storage unit in raster scan order, and then stores the coding tree units in raster scan order over the entire image; that is, the encoder and decoder differ in data arrangement. When they communicate, the encoder converts the coordinates of a given smallest prediction unit in the third data arrangement into its coordinates in the fourth data arrangement according to the mapping relationship, completing the communication and acquisition of data from the decoder to the encoder.
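The conversion between the two raster orders can be illustrated as follows. The sketch assumes the image width, measured in smallest prediction units, is a multiple of the CTU width; variable names are illustrative.

```python
def tiled_index(x, y, img_w_units, ctu_units):
    """Convert the (x, y) coordinates of a smallest prediction unit, given in
    plain raster order over the whole image, to the linear index used when
    units are stored raster-order inside each CTU and the CTUs are themselves
    stored raster-order over the image."""
    ctus_per_row = img_w_units // ctu_units
    ctu_col, ctu_row = x // ctu_units, y // ctu_units   # which CTU
    in_x, in_y = x % ctu_units, y % ctu_units           # position inside it
    ctu_index = ctu_row * ctus_per_row + ctu_col
    return ctu_index * ctu_units * ctu_units + in_y * ctu_units + in_x
```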
- It should be noted that there may be multiple pieces of encoding information, which may simultaneously include the following three situations:
- (1) the fourth data structure is different from the third data structure, and the fourth data arrangement is different from the third data arrangement;
- (2) the fourth data structure is different from the third data structure, and the fourth data arrangement is the same as the third data arrangement;
- (3) the fourth data structure is the same as the third data structure, and the fourth data arrangement is different from the third data arrangement.
- The multiple pieces of encoding information may also include any one or any two of the above three situations.
- Whichever of the above situations the multiple pieces of encoding information include, each piece of encoding information has a corresponding mapping relationship, and the encoder stores each piece of encoding information according to the fourth data structure and the fourth data arrangement based on its corresponding mapping relationship.
- In the embodiments of the present application, when the data structure and/or data arrangement differ between the decoder and the encoder, the encoder achieves orderly acquisition of the encoding information from the decoder through the mapping relationship, which is particularly suitable for situations where the decoder and the encoder have different developers.
- Embodiment 2 of the present application provides a transcoding device, as shown in FIG. 6, including:
- the obtaining module 100 is configured to decode the original video by a decoder to obtain the video frame sequence of the original video and the encoding information of the original video, where the original video is a video to which graphic information needs to be added;
- the adding module 200 is used to add the graphic information to the video frame sequence to obtain the video frame sequence after adding the graphic information
- the encoding module 300 is configured to use the encoding information to encode the video frame sequence with the added graphic information through the encoder to obtain a new video.
- An embodiment of the present application provides a transcoding device.
- In the device, the decoder obtains the encoding information during decoding, so the encoding information is obtained easily and quickly; the encoder encodes the video frame sequence with the graphic information added based on the encoding information, reducing the time spent computing coding decisions and ensuring consistency between the new video and the original video in information such as resolution, bit rate, and frame rate, greatly improving the picture quality of the new video and thereby alleviating the technical problem that traditional transcoding methods are time-consuming and prone to quality loss.
- the obtaining module 100 decodes the original video through the decoder to obtain the video frame sequence and encoding information of the original video.
- the video frame sequence and coding information can be stored in different locations of the original video, and the decoder can obtain the video frame sequence and coding information of the original video based on one or more parsing processes.
- The encoding information of the original video includes frame information, slice information, and coding unit information of the original video, where the frame information is video feature data of the image frames of the original video, such as basic feature data like video width and height; the slice information is the coding parameters of each slice of the original video; and the coding unit information is the coding parameters of the first basic coding units that make up each image frame of the original video.
- Each image frame can be divided into multiple slices, and each slice can be divided into multiple basic coding units.
- In different coding standards, the storage structures of the frame information, slice information, and coding unit information and their storage locations in the original video may differ, and the terms used to denote the basic coding unit may differ.
- the obtaining module is specifically used to:
- the original video is parsed to obtain encoding information of the original video.
- The encoding information of the original video includes: video header information, slice information, and coding tree unit information of the original video, where the video header information is the video feature data of the original video, the slice information is the coding parameters of a first slice, and the coding tree unit information is the coding parameters of a first coding tree unit, the first slice and the first coding tree unit belonging to the original video.
- the video header information includes: a video parameter set, a sequence parameter set, and a picture parameter set.
- the encoding module is specifically used for:
- a second obtaining unit configured to obtain the frame information, slice information and coding unit information of the original video from the encoding information of the original video, and obtain the video frame sequence after adding the graphic information;
- the second basic encoding unit is encoded according to the slice information and the encoding unit information.
- the coding unit is specifically used for:
- the coding unit is specifically used for:
- the first condition is that the current second basic coding unit is located in the area covered by the graphic information
- the second condition is that the current second basic coding unit is in inter mode and satisfies either of the following: it references an image in the coverage area, or its video motion vector prediction is affected by a target coding tree unit, where the target coding tree unit is a second coding tree unit, adjacent to the current second coding tree unit, that has been judged to be related to the coverage area.
- the coding unit is specifically used for:
- the second basic coding unit is coded using the slice information and the coding unit information.
- the encoder and the decoder are connected in communication, so that the encoder obtains encoding information from the decoder, where,
- the decoder transmits the encoding information to the encoder in the first data structure and the first data arrangement manner;
- the encoder receives the encoded information from the decoder in a second data structure and a second data arrangement, where,
- the second data structure is the same as the first data structure, and the second data arrangement is the same as the first data arrangement.
- the encoder and the decoder are communicatively connected so that the encoder obtains encoding information from the decoder, where,
- the decoder transmits encoding information to the encoder in a third data structure and a third data arrangement manner
- after receiving the encoding information, the encoder stores the encoding information according to a fourth data structure and a fourth data arrangement based on a mapping relationship, where,
- the fourth data structure is different from the third data structure, and/or, the fourth data arrangement is different from the third data arrangement;
- the mapping relationship is the correspondence between a first position and a second position, where the first position is the position of the encoding information in the third data structure and the third data arrangement, and the second position is the position of the encoding information in the fourth data structure and the fourth data arrangement.
- Embodiment 3 of the present application provides a transcoding device, including:
- a processor; and a memory for storing processor-executable instructions,
- the processor is configured to execute the transcoding method of the first embodiment.
- The processor is configured to perform the transcoding method of Embodiment 1; that is, the original video is decoded by the decoder to obtain the video frame sequence of the original video and the encoding information of the original video, where the original video is a video to which graphic information needs to be added; the graphic information is added to the video frame sequence to obtain the video frame sequence with the graphic information added; and the encoder encodes the video frame sequence with the graphic information added using the encoding information to obtain a new video.
- The decoder obtains the encoding information during decoding, so the encoding information is obtained conveniently and quickly; the video frame sequence with the graphic information added is encoded based on the encoding information, which reduces the time spent computing coding decisions and ensures consistency between the new video and the original video in information such as resolution, bit rate, and frame rate, greatly improving the image quality of the new video and thereby alleviating the technical problem that traditional transcoding methods are time-consuming and prone to quality loss.
- The transcoding device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
- the processing component 602 generally controls the overall operations of the transcoding device 600, such as operations associated with display, data communication, and recording operations.
- the processing component 602 may include one or more processors 620 to execute instructions to complete all or part of the steps in the above method.
- the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and other components.
- the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
- the memory 604 is configured to store various types of data to support operation at the transcoding device 600. Examples of these data include instructions for any application or method operating on the device 600, contact data, phone book data, messages, pictures, videos, and so on.
- The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- the power supply component 606 provides power to various components of the device 600.
- the power supply component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the transcoding device 600.
- The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user.
- the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundary of the touch or sliding action, but also detect the duration and pressure related to the touch or sliding operation.
- the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 610 is configured to output and/or input audio signals.
- the audio component 610 includes a microphone (MIC).
- the microphone is configured to receive an external audio signal.
- the received audio signal may be further stored in the memory 604 or transmitted via the communication component 616.
- the audio component 610 further includes a speaker for outputting audio signals.
- the I/O interface 612 provides an interface between the processing component 602 and a peripheral interface module.
- the peripheral interface module may be a keyboard, a click wheel, or a button. These buttons may include, but are not limited to: home button, volume button, start button, and lock button.
- the sensor component 614 includes one or more sensors for providing the device 600 with status assessments in various aspects.
- The sensor component 614 can detect the on/off state of the device 600 and the relative positioning of components, such as the display and keypad of the transcoding device 600; the sensor component 614 can also detect a change in the position of the transcoding device 600 or a component thereof, the presence or absence of user contact with the transcoding device 600, the orientation or acceleration/deceleration of the transcoding device 600, and a change in its temperature.
- the sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
- the sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor component 614 may further include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- the communication component 616 is configured to facilitate wired or wireless communication between the transcoding device 600 and other devices.
- the transcoding device 600 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof.
- the communication component 616 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
- the communication component 616 also includes a near field communication (NFC) module to facilitate short-range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- The transcoding device 600 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the above method.
- FIG. 8 is a structural block diagram of another transcoding device 700.
- the transcoding device 700 may be provided as a server.
- the transcoding device 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by the memory 732, for storing instructions executable by the processing component 722, such as application programs.
- the application programs stored in the memory 732 may include one or more modules each corresponding to a set of instructions.
- The processing component 722 is configured to execute the instructions to perform the above-mentioned transcoding method.
- The transcoding device 700 may also include a power component 726 configured to perform power management of the transcoding device 700, a wired or wireless network interface 750 configured to connect the transcoding device 700 to a network, and an input/output (I/O) interface 758.
- the transcoding device 700 can operate based on an operating system stored in the memory 732, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
- Embodiment 4 of the present application provides a computer-readable storage medium that stores computer instructions, and when the computer instructions are executed, the transcoding method of Embodiment 1 is implemented.
- The computer-readable storage medium may be, for example, the memory 604 including instructions, which can be executed by the processor 620 of the transcoding device 600 to complete the above method.
- the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, or the like.
- When the computer instructions are executed, the transcoding method of Embodiment 1 is implemented; that is, the original video is decoded by the decoder to obtain the video frame sequence of the original video and the encoding information of the original video, where the original video is a video to which graphic information needs to be added; the graphic information is added to the video frame sequence to obtain the video frame sequence with the graphic information added; and the encoder encodes the video frame sequence with the graphic information added using the encoding information to obtain a new video.
- The decoder obtains the encoding information during decoding, so the encoding information is obtained conveniently and quickly; the video frame sequence with the graphic information added is encoded based on the encoding information, which reduces the time spent computing coding decisions and ensures consistency between the new video and the original video in information such as resolution, bit rate, and frame rate, greatly improving the image quality of the new video and thereby alleviating the technical problem that traditional transcoding methods are time-consuming and prone to quality loss.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present application relates to a transcoding method, a transcoding device, and a computer-readable storage medium. The transcoding method includes: decoding an original video by a decoder to obtain a video frame sequence of the original video and encoding information of the original video, where the original video is a video to which graphic information needs to be added; adding the graphic information to the video frame sequence to obtain a video frame sequence with the graphic information added; and encoding, by an encoder, the video frame sequence with the graphic information added using the encoding information, to obtain a new video. In this transcoding method, the encoder encodes the video frame sequence with the graphic information added based on the encoding information obtained by the decoder during decoding, solving the technical problem that the traditional full-decode-full-encode transcoding approach is time-consuming and prone to quality loss.
Description
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on November 27, 2018, with application number 201811427461.1 and entitled "Transcoding method, transcoding device and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
This application belongs to the field of computer software applications, and in particular relates to a transcoding method, a transcoding device, and a computer-readable storage medium.
In live and on-demand video applications, the original video stream needs to be transcoded to meet various application requirements; transcoding is a process of first decoding and then re-encoding the original compressed video stream. At present, a fairly widespread requirement is to add graphic information to a certain area of an already encoded video (an area fixed or changing relative to the display), for example: watermark images, subtitles, picture-in-picture, and the magic emojis and stickers that appear in live streaming.
In the related art, when transcoding a media file, the compressed original video stream is first decoded into a video file in a raw video format; graphic information and the like are then superimposed onto a specific area of the video; and re-encoding is then performed. This transcoding method is in fact one of complete decoding followed by complete encoding. The inventors found that the full-decode-full-encode transcoding approach in the related art mainly has the following disadvantages:
First, the full-decode-full-encode approach involves a large amount of computation, so the processor has to handle a larger workload and encoding takes longer;
Second, in the initial encoding and re-encoding of the video, the encoders used may differ, or the encoding parameters used may differ, causing parameters such as the resolution and bit rate of the original video and the transcoded new video to be inconsistent; as a result, the image clarity of the transcoded new video is reduced compared with the original video, or the playback smoothness of the encoded new video is weakened, so there is a problem of video quality loss.
Summary of the invention
In view of the problems in the related art, the present application discloses a transcoding method, a transcoding device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present application provides a transcoding method, including:
decoding an original video by a decoder to obtain a video frame sequence of the original video and encoding information of the original video, where the original video is a video to which graphic information needs to be added;
adding the graphic information to the video frame sequence to obtain a video frame sequence with the graphic information added;
encoding, by an encoder, the video frame sequence with the graphic information added using the encoding information, to obtain a new video.
In a second aspect, an embodiment of the present application provides a transcoding device, including:
an obtaining module, configured to decode an original video by a decoder to obtain a video frame sequence of the original video and encoding information of the original video, where the original video is a video to which graphic information needs to be added;
an adding module, configured to add the graphic information to the video frame sequence to obtain a video frame sequence with the graphic information added;
an encoding module, configured to encode, by an encoder, the video frame sequence with the graphic information added using the encoding information, to obtain a new video.
In a third aspect, an embodiment of the present application provides a transcoding device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the transcoding method of any one of the above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer instructions which, when executed, implement the transcoding method of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product, including a computer program, the computer program including program instructions which, when executed by an electronic device, cause the electronic device to execute any one of the above transcoding methods.
The technical solutions provided by the embodiments of the present application may include the following beneficial effects:
In the transcoding method provided by the embodiments of the present application, the original video is decoded by the decoder to obtain the video frame sequence of the original video and the encoding information of the original video, and the encoding information is obtained conveniently and quickly during decoding; the encoder encodes the video frame sequence with the graphic information added using the encoding information to obtain a new video, which reduces the time spent computing coding decisions and ensures consistency between the new video and the original video in information such as resolution, bit rate, and frame rate, greatly improving the picture quality of the new video and thereby alleviating the technical problem that traditional transcoding methods are time-consuming and prone to quality loss.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
FIG. 1 is a schematic diagram of a conventional transcoding approach;
FIG. 2 is a flowchart of a transcoding method provided by Embodiment 1;
FIG. 3 is a flowchart of a method for encoding a video frame sequence with graphic information added using encoding information, provided by Embodiment 1;
FIG. 4 is a flowchart of a method for encoding a second basic coding unit using slice information and coding unit information, provided by Embodiment 1;
FIG. 5 shows the mapping relationship of encoding information in an exemplary embodiment;
FIG. 6 is a structural block diagram of a transcoding device provided by Embodiment 2;
FIG. 7 is a structural block diagram of a transcoding device provided by Embodiment 3;
FIG. 8 is a structural block diagram of another transcoding device provided by Embodiment 3.
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. At present, to meet users' need to add graphic information to an already encoded video, the transcoding method in the related art decodes the source stream of the original video into video in a raw video format, for example the YUV (Luminance, Chrominance, Chroma) format; superimposes the graphic information onto a specific area of the video; and then performs re-encoding. FIG. 1 is a schematic diagram of the conventional transcoding approach. Referring to FIG. 1, the conventional transcoding process is as follows:
First, the decoder decodes the source stream (compressed video stream) to which graphic information needs to be added into a video frame sequence in YUV format;
Then, the graphic information is superimposed on each YUV video frame that requires it, generating a YUV video frame sequence with the graphic information added;
Finally, the YUV video frame sequence with the graphic information added enters the encoder and is encoded again, generating a transcoded stream that forms the new video.
This full-decode-full-encode approach requires decoding the entire source video stream and encoding the entire decoded video stream, which is rather time-consuming.
In addition, the above full-decode-full-encode transcoding approach has the following problems:
(1) Changes to the Group of Pictures (GOP) structure of the video may change some characteristics of the video, such as image frame delay. For example, because video encoders differ or the parameters they use differ, the GOP structures of different video streams are not the same. At the transcoding end, the GOP structure of the source stream is difficult to determine programmatically; since GOPs may become longer or shorter, applying differentiated processing to different video streams is quite difficult, while transcoding uniformly with the same encoding parameters destroys the GOP structure of the source stream.
In many practical applications (such as live and on-demand scenarios), the GOP length determines the image frame delay; therefore, a change in GOP structure leads to a change in image frame delay.
In addition, considering the different importance of different frame types, to improve the overall quality of the video stream, the encoder usually assigns a smaller quantization parameter (QP) to I frames to preserve higher image quality, with P frames next, and B frames assigned a larger QP and thus having relatively the worst image quality. A change in the video's GOP structure means that the same frame in the new video and the original video may use different frame types; for example, an I frame of the source stream may become a P frame or even a B frame, and an original P/B frame may be used as an I frame by the transcoded stream, thereby harming the overall quality of the video stream.
(2) The bit rate of the transcoded stream cannot stay consistent with that of the source stream at all times. For example, bit rate information is not declared in the video header under the HEVC standard; it is non-standard data. In practice, the metadata of the mp4 format may contain bit rate data, but in many streams the metadata has no video bit rate data, or the value is wrong, so the metadata cannot provide a reliable bit rate. Moreover, since most streams are currently encoded with ABR (available bit-rate) rate control, the bit rate changes in real time; in this case, monitoring the source stream's bit rate and notifying the encoder to change accordingly in real time is also very difficult. Meanwhile, because the GOP structure may be changed, the same frame in the new video and the original video may use different frame types, making it impossible for the bit rate of the transcoded stream to stay consistent with the source stream at all times.
In summary, the full-decode-full-encode transcoding approach causes inconsistencies in resolution, bit rate, and so on between the original video and the new video, reducing the image clarity of the new video or weakening playback smoothness, so there is a problem of video quality loss.
On this basis, the embodiments of the present application provide a transcoding method, a transcoding device, and a computer-readable storage medium, to solve the technical problem that the full-decode-full-encode transcoding method is time-consuming and prone to quality loss.
To facilitate understanding of the embodiments of the present application, specific implementations of the present application are described in further detail below with reference to the accompanying drawings and embodiments.
Embodiment 1 of the present application provides a transcoding method, as shown in FIG. 2, including:
Step S102: decoding an original video by a decoder to obtain a video frame sequence of the original video and encoding information of the original video, where the original video is a video to which graphic information needs to be added;
Step S104: adding the graphic information to the video frame sequence to obtain a video frame sequence with the graphic information added;
Step S106: encoding, by an encoder, the video frame sequence with the graphic information added using the encoding information, to obtain a new video.
It should be noted that the encoder and the decoder are two different functional modules: the encoder encodes the video frame sequence, and the decoder decodes the original video. The encoder and the decoder may be two separate devices, or two functional modules integrated in one device housing; the embodiments of the present application place no limitation on the encoder and the decoder.
It should be noted that, in the embodiments of the present application, the graphic information to be added to the video frame sequence of the original video includes, but is not limited to, image watermarks, audio watermarks, subtitles, bullet comments, picture-in-picture, stickers, and magic emojis.
In step S102, the original video is decoded by the decoder to obtain the video frame sequence and encoding information of the original video. The video frame sequence and encoding information may be stored at different locations of the original video, and the decoder may obtain the video frame sequence and encoding information of the original video based on one or more parsing passes.
Further, the encoding information of the original video includes frame information, slice information, and coding unit information of the original video, where the frame information is video feature data of the image frames of the original video, such as basic feature data like video width and height; the slice information is the coding parameters of each slice of the original video; and the coding unit information is the coding parameters of the first basic coding units that make up each image frame of the original video. Each image frame can be divided into multiple slices, and each slice can be divided into multiple basic coding units. In different coding standards, the storage structures of the frame information, slice information, and coding unit information and their storage locations in the original video may differ, and the terms used to denote the basic coding unit may differ.
The above embodiment is described below for the HEVC standard and the VP9 standard respectively.
In the HEVC standard, the frame information is the feature data of the image frames, and the basic coding unit is the coding tree unit. For HEVC, the frame information is stored in the video header information of the original video. The slice information is information about the slices that make up the image frames of the original video; the coding parameters of the first slice of the first image frame may be used. The coding unit information is the coding parameters of the coding tree units that make up the image frames of the original video; the coding parameters of the first coding tree unit in the first slice may be used.
It should be noted that:
(1) The video header information, as the video feature data of the original video, is the most important video information, containing basic feature data such as the width and height of the original video; these data are usually used when initializing the encoder;
(2) The slice information is the header information of slices. The slice is the image partitioning unit of High Efficiency Video Coding (HEVC); one image frame can be divided into multiple slices or treated as a single slice, and in many cases, to simplify encoding and decoding, one image frame is encoded as one slice. The slice header contains coding parameters used by the slice to configure the encoding.
(3) The Coding Tree Unit (CTU) is the basic unit of HEVC video coding; the CTU size can range from 8x8 to 64x64, and one slice can include one or more CTUs. The coding tree unit information is the coding parameters used by the coding tree unit.
The video header information, slice information, and coding tree unit information constitute the encoding information of an original video under the HEVC standard, and this information is obtained fairly conveniently while parsing the original video. Moreover, the video header information, slice information, and coding tree unit information describe quite comprehensively the parameters used when the HEVC original video was encoded, so that in step S106, after the encoder encodes the video frame sequence with the graphic information added using the encoding information, the resulting new video maintains good consistency with the original video, ensuring that the quality of the new video is not compromised.
For an original video under the HEVC standard, the video header information may include: a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), and a Picture Parameter Set (PPS).
That is, in the process of decoding the original video into the video frame sequence, the original video is parsed, and the obtained frame information is contained in the video parameter set, sequence parameter set, and picture parameter set. Thus, when re-encoding the video frame sequence with the graphic information added, the encoder can also refer to data such as the video parameter set, sequence parameter set, and picture parameter set, and can better restore the video characteristics of the original video.
In a possible implementation, the PPS contains setting information that differs per image frame, mainly including: self-referencing information, initial picture control information (such as the initial QP), and tiling information. At the start of decoding, all PPSs are inactive, and at any moment during decoding, at most one PPS can be active. When some part of the bitstream references a PPS, that PPS is activated and called the active PPS, until another PPS is activated.
The SPS provides the information required by all slices in a video sequence. Its contents may include: decoding-related information, such as the profile and level, resolution, and number of sub-layers; feature switches for a given profile and the parameters of those features; restrictions on the flexibility of structures and of transform coefficient coding; and temporal scalability information.
The VPS is used to describe the overall structure of the coded video sequence, including temporal sub-layer dependencies and so on. The main purpose of adding this structure to HEVC is to accommodate extensions of the standard in the multi-sub-layer aspects of systems. The sub-layers of a given video sequence all share one VPS, whether or not their SPSs are the same. The VPS mainly contains: syntax elements shared by multiple sub-layers or operation points; session key information such as profile and level; and other operation-point-specific information that does not belong to the SPS.
Unlike HEVC, VP9 has no video header information such as VPS/SPS/PPS, only frame-level header information; each image frame has an uncompressed header and a compressed header. Accordingly, the frame information is stored in the uncompressed header. In addition, the uncompressed header also contains other information, such as some of the information found in the sps and pps, and the slice information. The compressed header is the probability table used for entropy coding of the syntax elements of the current frame. Therefore, for VP9, the encoding information obtainable from the frame headers includes the frame information and the slice information; the frame information is the basic feature information of the video image.
In addition, the VP9 standard uses a frame/slice/superblock/block coding hierarchy. Below the image frame, the frame can be divided into superblocks of size 64x64, and slices are divided along superblock boundaries, the division being declared in the uncompressed header. The superblock (SB) is the basic coding unit of VP9 video coding, and each SB can be recursively divided into blocks in quadtree form. In the present application, the coding parameters of the superblock, such as the SB partition mode, the coding modes of the blocks, the motion vectors (mv), the quantizer, and so on, serve as the coding unit information.
In another optional implementation of the present application, as shown in FIG. 3, step S106, encoding the video frame sequence with the graphic information added using the encoding information, includes:
Step S301: obtaining the frame information, slice information, and coding unit information from the decoder, and obtaining the video frame sequence with the graphic information added.
Step S302: initializing the encoder using the frame information.
The frame information represents the basic feature information of the original video; initializing the encoder with the frame information keeps the new video consistent with the original video in the configuration parameters of the encoder used.
Step S303: dividing the video frame sequence with the graphic information added into second basic coding units.
In the HEVC standard, the basic coding unit is the coding tree unit; in VP9, the basic coding unit is the superblock. The terms "first basic coding unit" and "second basic coding unit" merely distinguish two different basic coding units. In this step, each frame of the video frame sequence with the graphic information added is divided, in raster scan order (left to right, then top to bottom), into basic coding units of a fixed size.
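A minimal sketch of the raster-order division of step S303, assuming (as an illustration) that frame dimensions need not be multiples of the unit size, so edge units are clipped to the frame boundary:

```python
def split_into_units(width, height, unit):
    """Partition a frame into fixed-size basic coding units (CTUs in HEVC,
    superblocks in VP9) in raster scan order: left to right, top to bottom.
    Each entry is (x, y, w, h); edge units are clipped to the frame."""
    units = []
    for y in range(0, height, unit):
        for x in range(0, width, unit):
            units.append((x, y, min(unit, width - x), min(unit, height - y)))
    return units
```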
Step S304: encoding, by the initialized encoder, the second basic coding units according to the slice information and the coding unit information.
In the HEVC standard, after an image frame is divided into coding tree units, each coding tree unit can be recursively divided in quadtree form into multi-level Coding Units (CUs). In the VP9 standard, after an image frame is divided into superblocks, each superblock can be recursively divided in quadtree form into multi-level blocks.
For the HEVC standard, the coding unit information includes the CU depth and the partition modes used when dividing the CTU into CUs. Using the coding unit information to divide the second coding tree unit into coding units makes the coding unit partitioning consistent with that of the original video's encoding process. After the CTU is divided into CUs, intra and inter prediction, the Discrete Cosine Transform (DCT), and quantization are performed per CU; the transformed and quantized residual coefficients are then run-length scanned and finally entropy coded to complete the encoding process. The slice information includes the frame display order, the number of reference frames, reference data set information, and so on; the coding tree unit information includes the CU depth and partition mode, the coding mode, the quantization parameter QP, Sample Adaptive Offset (SAO) parameters, and so on. Encoding the second coding tree unit using the slice information and coding tree unit information ensures that the new video and the original video remain consistent in how the coding units are encoded.
For the VP9 standard, the coding unit information includes the block depth and the partition modes used when dividing the superblock into blocks. Using the coding unit information to divide the second superblock into multiple blocks makes the block partitioning consistent with that of the original video's encoding process. After the superblock is divided into blocks, intra and inter prediction, the Discrete Cosine Transform (DCT), and quantization are performed per block; the transformed and quantized residual coefficients are then run-length scanned and finally entropy coded to complete the encoding process. The slice information includes the frame display order, the number of reference frames, reference data set information, and so on; the coding unit information includes the block depth and partition mode, the coding mode, the quantization parameter QP, and so on. Encoding the second superblock using the slice information and coding unit information ensures that the new video and the original video remain consistent in how the coding units are encoded.
In the embodiments of the present application, the new video and the original video remain consistent in the configuration parameters of the encoder used, in coding unit partitioning, and in coding unit encoding, so that the new video remains consistent with the original video in video quality, alleviating the technical problem of video quality loss.
In another optional implementation of the present application, as shown in FIG. 4, step S304, encoding the second coding tree unit according to the slice information and coding tree unit information, includes:
Step S401: obtaining the position information of each second basic coding unit;
Step S402: judging in turn, based on the position information, whether the current second basic coding unit is related to the area covered by the graphic information, to obtain a judgment result;
The judgment result of this step is: the current second basic coding unit is related to the area covered by the graphic information, or the current second basic coding unit is not related to the area covered by the graphic information.
Step S403: determining, according to the judgment result, whether to encode the second basic coding unit using the slice information and coding unit information.
When the current second coding tree unit is unrelated to the area covered by the graphic information, adding the graphic information has not changed the current second basic coding unit, so the current second basic coding unit remains unchanged; in this case, the coding decision of the current second basic coding unit remains unchanged, maintaining consistency with the original video quality.
When the current second coding tree unit is related to the area covered by the graphic information, adding the graphic information has changed the current second basic coding unit, so the current second basic coding unit changes; in this case, the coding decision used to encode the second basic coding unit is re-determined, and the second coding unit is encoded using the re-determined coding decision.
In the embodiments of the present application, whether to encode the second basic coding unit using the slice information and coding unit information is determined by the relationship between the second basic coding unit and the area covered by the graphic information, fully considering the influence of the coverage area on the second basic coding unit and making the coding decision of the second basic coding unit more reasonable and scientific.
In another optional implementation of the present application, step S402, judging in turn based on the position information whether the current second coding tree unit is related to the area covered by the graphic information, includes:
judging whether the current second basic coding unit satisfies either of a first condition and a second condition, where
the first condition is that the current second basic coding unit is located in the area covered by the graphic information;
the second condition is that the current second basic coding unit is in inter mode and satisfies either of the following: it references an image in the coverage area, or its video motion vector prediction is affected by a target coding tree unit, where the target coding tree unit is a second basic coding unit, adjacent to the current second basic coding unit, that has already been judged to be related to the coverage area.
When the current second basic coding unit satisfies either of the first and second conditions, the judgment result of step S402 is: the current second basic coding unit is related to the area covered by the graphic information; when the current second basic coding unit satisfies neither the first condition nor the second condition, the judgment result of step S402 is: the current second basic coding unit is unrelated to the area covered by the graphic information.
In the embodiments of the present application, judging whether the current second basic coding unit is related to the area covered by the graphic information considers not only the first condition but also the second condition, covering the relevant situations fairly comprehensively and determining the influence of the coverage area on the current second coding tree unit comprehensively and accurately.
在步骤S403中,根据判断结果,确定是否使用片信息和编码单元信息对第二基础编码单元进行编码,进一步包括:
在判断结果为当前第二基础编码单元与图文信息的覆盖区域无关的情况下,使用片信息和编码单元信息对第二基础编码单元进行编码。
在判断结果为当前第二基础编码单元与图文信息覆盖的区域有关的情况下,重新确定对第二基础编码单元进行编码所使用的编码决策,使用重新确定的编码决策对所述第二基础编码单元进行编码。
具体地,在判断结果为当前第二基础编码单元与图文信息的覆盖区域有关的情况下,重新确定对第二基础编码单元进行编码所使用的编码决策,具体包括重新确定CU或块的深度和划分方式、编码模式等。
本申请实施例中,在当前第二基础编码单元与图文信息的覆盖区域无关的情况下,使用片信息和编码单元信息对第二基础编码单元进行编码,即无需对当前第二基础编码单元进行编码决策的计算。由于一般情况下与覆盖区域无关的第二基础编码单元占据较大比例,因而很大程度上减少了编码器编码过程中的计算量,减轻了处理器的负载,加快了转码,缓解了传统转码方法耗时长的技术问题。
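据此,编码主循环可以只对与覆盖区域有关的单元重新做决策计算,其余单元直接复用原视频的编码单元信息。以下为一个示意草图,其中决策的具体内容用字符串占位,"recomputed" 表示需重新计算,均为本文为说明而设的假设接口:

```python
def encode_units(units, cached_info, affected):
    """units: 第二基础编码单元列表;cached_info: 原视频对应单元的编码决策;
    affected: 与覆盖区域有关的单元下标集合。返回每个单元实际使用的决策。"""
    decisions = []
    for i, _ in enumerate(units):
        if i in affected:
            decisions.append("recomputed")    # 重新做划分、模式等决策计算
        else:
            decisions.append(cached_info[i])  # 直接复用原视频的编码决策
    return decisions

# 用法示意:三个单元中只有下标 1 的单元与覆盖区域有关
d = encode_units(["u0", "u1", "u2"], ["c0", "c1", "c2"], affected={1})
# d == ["c0", "recomputed", "c2"]
```

与覆盖区域无关的单元占比越大,被复用的决策越多,转码越快。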
本申请的另一个可选实施方式中,编码器和解码器通信连接,以便编码器从解码器获取编码信息,其中,
解码器以第一数据结构和第一数据排列方式向编码器传送编码信息;
编码器以第二数据结构和第二数据排列方式接收来自解码器的编码信息,其中,
第二数据结构和第一数据结构相同,第二数据排列方式和第一数据排列方式相同。
本申请实施例中,编码器和解码器通信连接,编码器和解码器之间以相同的数据结构和相同的数据排列方式实现编码信息的传递,保证了编码信息在编码器和解码器之间快速且准确的传递。
本申请的另一个可选实施方式中,编码器和解码器通信连接,以便编码器从解码器获取编码信息,其中,
解码器以第三数据结构和第三数据排列方式向编码器传送编码信息;
编码器接收到编码信息后,根据映射关系将编码信息按照第四数据结构和第四数据排列方式进行存储,其中,
第四数据结构和第三数据结构不同,和/或,第四数据排列方式和第三数据排列方式不同;
映射关系为第一位置和第二位置之间的对应关系,第一位置为编码信息在第三数据结构和第三数据排列方式中的位置,第二位置为编码信息在第四数据结构和第四数据排列方式中的位置。
具体地,假设编码信息包括A、B、C,上述编码信息在第三数据结构和第三数据排列方式中的排列顺序为ACB,在第四数据结构和第四数据排列方式中的排列顺序为ABC,则映射关系如图5中的箭头所示。
例如,编码器和解码器在保存量化参数这一编码信息时,可能由于计算方法不同,一个以图像水平方向的编码单元个数作为数据的单位行宽,另一个以图像水平方向的编码单元个数加1作为数据的单位行宽,即编码器和解码器在数据结构方面不同。二者进行数据通信时,编码器根据映射关系逐行地把第三数据结构中的量化参数数组映射到第四数据结构的量化参数数组并存储。
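以量化参数为例,若一侧以每行 W 个编码单元存储,另一侧以每行 W+1 个存储,则可按映射关系逐行搬移。以下仅为示意草图:行宽差异的具体成因以及多出位置的填充策略(此处用该行最后一个有效 QP 填充)均为本文假设:

```python
def remap_qp(src, width_src, width_dst, rows):
    """把行宽为 width_src 的一维 QP 数组,逐行映射到行宽为 width_dst 的数组。
    目标行宽更大时,多出的列用该行最后一个有效 QP 填充(本文假设的填充策略)。"""
    dst = []
    for r in range(rows):
        row = src[r * width_src:(r + 1) * width_src]          # 取出源数组的一行
        pad = row[-1:] * (width_dst - width_src) if width_dst > width_src else []
        dst.extend((row + pad)[:width_dst])                   # 按目标行宽写入
    return dst

# 用法示意:源行宽 2,目标行宽 3,共 2 行
qp = remap_qp([26, 27, 28, 29], width_src=2, width_dst=3, rows=2)
# qp == [26, 27, 27, 28, 29, 29]
```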
又例如,编码器和解码器在保存运动矢量信息这一编码信息时:一个是以最小预测单元为存储单位,在整幅图像上按照光栅扫描顺序存放运动矢量信息;另一个是先在一个编码树单元内部以最小预测单元为存储单位,按照光栅扫描顺序存放运动矢量信息,然后在整幅图像上按照光栅扫描顺序存放各个编码树单元,即编码器和解码器在数据排列方式方面不同。二者进行数据通信时,编码器根据映射关系将某个最小预测单元在第三数据排列中的坐标转换成在第四数据排列中的坐标,完成解码器到编码器之间数据的通信和获取。
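对运动矢量信息,上述坐标换算可以表达为:把以最小预测单元为单位的图像光栅坐标 (x, y),换算为"先按编码树单元光栅排列、编码树单元内部再光栅排列"的一维下标。以下为示意草图,图像宽度与编码树单元尺寸均为举例假设:

```python
def raster_to_ctu_index(x, y, pic_w_units, ctu_units):
    """(x, y) 为最小预测单元坐标(以单元为单位)。
    pic_w_units: 图像宽度包含的最小预测单元数(假设能被 ctu_units 整除)
    ctu_units:   一个编码树单元每行包含的最小预测单元数"""
    ctus_per_row = pic_w_units // ctu_units
    ctu_x, ctu_y = x // ctu_units, y // ctu_units   # 所在编码树单元的坐标
    in_x, in_y = x % ctu_units, y % ctu_units       # 编码树单元内部坐标
    ctu_index = ctu_y * ctus_per_row + ctu_x        # 编码树单元的光栅序号
    return ctu_index * ctu_units * ctu_units + in_y * ctu_units + in_x

# 用法示意:图像宽 32 个单元,编码树单元为 16x16 个单元;
# 单元 (17, 1) 位于第 1 个编码树单元内部的 (1, 1) 处
i = raster_to_ctu_index(17, 1, pic_w_units=32, ctu_units=16)
# i == 1*256 + 1*16 + 1 == 273
```

反向换算(第四排列到第三排列)只需按相同关系把一维下标还原为坐标即可。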
需要说明的是,本申请实施例的编码信息可能为多个,多个编码信息可能同时包括以下三种情形:
(1)第四数据结构和第三数据结构不同,且第四数据排列方式和第三数据排列方式不同;
(2)第四数据结构和第三数据结构不同,且第四数据排列方式和第三数据排列方式相同;
(3)第四数据结构和第三数据结构相同,且第四数据排列方式和第三数据排列方式不同。多个编码信息也可能仅包括上述三种情形中的任一种或者任两种。
无论多个编码信息包括上述三种情形中的几种,多个编码信息中的每个编码信息都有一个对应的映射关系,编码器根据编码信息对应的映射关系,将编码信息按照第四数据结构和第四数据排列方式进行存储。
本申请实施例中,在解码器和编码器之间数据结构和/或数据排列方式不同情形下,编码器通过映射关系实现了从解码器有序获取编码信息的目的,尤其适应于解码器和编码器开发者不同的情况。
本申请实施例二提供一种转码装置,如图6所示,包括:
获取模块100,用于通过解码器对原视频进行解码,获得所述原视频的视频帧序列及所述原视频的编码信息,其中,所述原视频是需要添加图文信息的视频;
添加模块200,用于给所述视频帧序列添加所述图文信息,得到添加图文信息后的视频帧序列;
编码模块300,用于通过编码器,使用所述编码信息对所述添加图文信息后的视频帧序列进行编码,得到新视频。
本申请实施例提供的转码装置中,解码器解码过程中获得编码信息,编码信息的获取方便快捷;编码器基于编码信息对添加图文信息后的视频帧序列进行编码,减少了计算编码决策所耗费的时间,且保证了新视频和原视频在分辨率、码率和帧率等信息上的一致性,大幅改善新视频的画质,从而缓解了传统转码方法耗时长且质量易受损的技术问题。
其中,获取模块100通过解码器对原视频进行解码,获取原视频的视频帧序列和编码信息。视频帧序列和编码信息可以存储在原视频的不同位置,解码器可以基于一次或多次的解析过程获得原视频的视频帧序列和编码信息。
进一步地,原视频的编码信息包括原视频的帧信息、片信息和编码单元信息,其中,帧信息为所述原视频的图像帧的视频特征数据,例如视频宽高等基本特征数据,所述片信息为原视频的每个片的编码参数,编码单元信息为组成原视频的每个图像帧的第一基础编码单元的编码参数。每个图像帧可以分成多个片,每个片可以分成多个基础编码单元。在不同的编码标准中,帧信息、片信息和编码单元信息的存储结构和在原视频中的存储位置可能不同,用于表示基础编码单元的术语可能不同。
本申请实施例的一个可选实施方式中,获取模块具体用于:
将原视频解码为视频帧序列;
在将所述原视频解码为所述视频帧序列的过程中,对原视频进行解析,获取原视频的编码信息。
本申请实施例的一个可选实施方式中,原视频的编码信息包括:原视频的视频头信息、片信息和编码树单元信息,其中,视频头信息为原视频的视频特征数据,片信息为第一片的编码参数,编码树单元信息为第一编码树单元的编码参数,且第一片和第一编码树单元属于原视频。
本申请实施例的另一个可选实施方式中,视频头信息包括:视频参数集、序列参数集和图像参数集。
本申请实施例的另一个可选实施方式中,编码模块具体用于:
从原视频的编码信息获取所述原视频的帧信息、片信息和编码单元信息,并获取所述添加图文信息后的视频帧序列;
使用所述帧信息,将所述编码器初始化;
将所述添加图文信息后的视频帧序列划分为第二基础编码单元;
通过初始化后的所述编码器,根据所述片信息和所述编码单元信息,对所述第二基础编码单元进行编码。
本申请实施例的另一个可选实施方式中,编码模块具体用于:
获取各个第二基础编码单元的位置信息;
依次基于位置信息,判断当前第二基础编码单元是否与所述图文信息的覆盖区域有关,得到判断结果;
根据判断结果,确定是否使用片信息和编码单元信息对第二基础编码单元进行编码。
本申请实施例的另一个可选实施方式中,编码模块具体用于:
判断当前第二基础编码单元是否满足第一条件和第二条件中的任一条件,其中,
所述第一条件为当前第二基础编码单元位于图文信息覆盖的区域;
所述第二条件为当前第二基础编码单元为帧间模式且满足以下任一条件:参考了所述覆盖区域的图像,或者运动矢量预测受到目标编码树单元的影响,其中,所述目标编码树单元为与当前所述第二基础编码单元相邻的、已被判为与所述覆盖区域有关的所述第二基础编码单元。
本申请实施例的另一个可选实施方式中,编码模块具体用于:
在所述判断结果为当前所述第二基础编码单元与所述图文信息的覆盖区域无关的情况下,使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码。
本申请实施例的另一个可选实施方式中,编码器和解码器通信连接,以便编码器从解码器获取编码信息,其中,
解码器以第一数据结构和第一数据排列方式向编码器传送编码信息;
编码器以第二数据结构和第二数据排列方式接收来自解码器的编码信息,其中,
第二数据结构和第一数据结构相同,第二数据排列方式和第一数据排列方式相同。
本申请的另一个可选实施方式中,编码器和解码器通信连接,以便编码器从解码器获取编码信息,其中,
解码器以第三数据结构和第三数据排列方式向编码器传送编码信息;
编码器接收到编码信息后,根据映射关系将编码信息按照第四数据结构和第四数据排列方式进行存储,其中,
第四数据结构和第三数据结构不同,和/或,第四数据排列方式和第三数据排列方式不同;
映射关系为第一位置和第二位置之间的对应关系,第一位置为编码信息在第三数据结构和第三数据排列方式中的位置,第二位置为编码信息在第四数据结构和第四数据排列方式中的位置。
关于上述实施例中的转码装置,由于其中各个模块的功能已经在上述转码方法的实施例中进行了详细描述,此处仅作了相对简略的描述。
本申请实施例三提供了一种转码装置,包括:
处理器;
用于存储处理器可执行指令的存储器;
其中,处理器被配置为执行实施例一的转码方法。
本申请实施例中,处理器被配置为执行实施例一的转码方法,即,通过解码器对原视频进行解码,获得原视频的视频帧序列及原视频的编码信息,其中,原视频是需要添加图文信息的视频;给视频帧序列添加图文信息,得到添加图文信息后的视频帧序列;通过编码器,使用编码信息对添加图文信息后的视频帧序列进行编码,得到新视频。其中,解码器解码过程中获得编码信息,编码信息的获取方便快捷;基于编码信息对添加图文信息后的视频帧序列进行编码,减少了计算编码决策所耗费的时间,且保证了新视频和原视频在分辨率、码率和帧率等信息上的一致性,大幅改善新视频的画质,从而缓解了传统转码方法耗时长且质量易受损的技术问题。
图7所示为一种转码装置600的结构框图。参照图7,转码装置600可以包括以下一个或多个组件:处理组件602,存储器604,电源组件606,多媒体组件608,音频组件610,输入/输出(I/O)接口612,传感器组件614,以及通信组件616。
处理组件602通常控制转码装置600的整体操作,诸如与显示,数据通信,和记录操作相关联的操作。处理组件602可以包括一个或多个处理器620来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件602可以包括一个或多个模块,便于处理组件602和其他组件之间的交互。例如,处理组件602可以包括多媒体模块,以方便多媒体组件608和处理组件602之间的交互。
存储器604被配置为存储各种类型的数据以支持在转码装置600的操作。这些数据的示例包括用于在装置600上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器604可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
电源组件606为装置600的各种组件提供电力。电源组件606可以包括电源管理系统,一个或多个电源,及其他与为转码装置600生成、管理和分配电力相关联的组件。
多媒体组件608包括在装置600和用户之间提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与触摸或滑动操作相关的持续时间和压力。在一些实施例中,多媒体组件608包括一个前置摄像头和/或后置摄像头。当设备600处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜系统,或具有焦距和光学变焦能力。
音频组件610被配置为输出和/或输入音频信号。例如,音频组件610包括一个麦克风(MIC),当转码装置600处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器604或经由通信组件616发送。在一些实施例中,音频组件610还包括一个扬声器,用于输出音频信号。
I/O接口612为处理组件602和外围接口模块之间提供接口,上述外围接口模块可以是键盘,点击轮,按钮等。这些按钮可包括但不限于:主页按钮、音量按钮、启动按钮和锁定按钮。
传感器组件614包括一个或多个传感器,用于为装置600提供各个方面的状态评估。例如,传感器组件614可以检测到设备600的打开/关闭状态,组件的相对定位,例如组件为转码装置600的显示器和小键盘,传感器组件614还可以检测转码装置600或转码装置600一个组件的位置改变,用户与转码装置600接触的存在或不存在,转码装置600方位或加速/减速和转码装置600的温度变化。传感器组件614可以包括接近传感器,被配置用来在没有任何的物理接触时检测附近物体的存在。传感器组件614还可以包括光传感器, 如CMOS或CCD图像传感器,用于在成像应用中使用。在一些实施例中,该传感器组件614还可以包括加速度传感器,陀螺仪传感器,磁传感器,压力传感器或温度传感器。
通信组件616被配置为便于转码装置600和其他设备之间有线或无线方式的通信。转码装置600可以接入基于通信标准的无线网络,如WiFi,运营商网络(如2G、3G、4G或5G),或它们的组合。在一个示例性实施例中,通信组件616经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中,通信组件616还包括近场通信(NFC)模块,以促进短程通信。例如,在NFC模块可基于射频识别(RFID)技术,红外数据协会(IrDA)技术,超宽带(UWB)技术,蓝牙(BT)技术和其他技术来实现。
在示例性实施例中,转码装置600可以被一个或多个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。
图8所示为另一种转码装置700的结构框图。例如,转码装置700可以被提供为一服务器。参照图8,转码装置700包括处理组件722,其进一步包括一个或多个处理器,以及由存储器732所代表的存储器资源,用于存储可由处理组件722执行的指令,例如应用程序。存储器732中存储的应用程序可以包括一个或一个以上的模块,每一个模块对应于一组指令。此外,处理组件722被配置为执行指令,以执行上述转码方法。
转码装置700还可以包括一个电源组件726,被配置为执行转码装置700的电源管理;一个有线或无线网络接口750,被配置为将转码装置700连接到网络;和一个输入输出(I/O)接口758。转码装置700可以基于存储在存储器732中的操作系统进行操作,例如Windows ServerTM、Mac OS XTM、UnixTM、LinuxTM、FreeBSDTM或类似系统。
本申请实施例四提供了一种计算机可读存储介质,计算机可读存储介质存储有计算机指令,计算机指令被执行时实现实施例一的转码方法。
具体地,计算机可读存储介质,例如包括指令的存储器604,上述指令可由转码装置600的处理器620执行以完成上述方法。例如,非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。
本申请实施例中,计算机指令被执行时实现实施例一的转码方法,即,通过解码器对原视频进行解码,获得原视频的视频帧序列及原视频的编码信息,其中,原视频是需要添加图文信息的视频;给视频帧序列添加图文信息,得到添加图文信息后的视频帧序列;通过编码器,使用编码信息对添加图文信息后的视频帧序列进行编码,得到新视频。其中,解码器解码过程中获得编码信息,编码信息的获取方便快捷;基于编码信息对添加图文信息后的视频帧序列进行编码,减少了计算编码决策所耗费的时间,且保证了新视频和原视频在分辨率、码率和帧率等信息上的一致性,大幅改善新视频的画质,从而缓解了传统转码方法耗时长且质量易受损的技术问题。
应当理解的是,本申请并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本申请的范围仅由所附的权利要求来限制。
Claims (20)
- 一种转码方法,包括:通过解码器对原视频进行解码,获得所述原视频的视频帧序列及所述原视频的编码信息,其中,所述原视频是需要添加图文信息的视频;给所述视频帧序列添加所述图文信息,得到添加图文信息后的视频帧序列;通过编码器,使用所述编码信息对所述添加图文信息后的视频帧序列进行编码,得到新视频。
- 根据权利要求1所述的方法,所述原视频的编码信息包括:所述原视频的帧信息、片信息和编码单元信息,其中,所述帧信息为所述原视频的每个图像帧的视频特征数据,所述片信息为原视频的每个片的编码参数,所述编码单元信息为组成所述原视频的每个图像帧的第一基础编码单元的编码参数。
- 根据权利要求2所述的方法,所述通过编码器,使用所述编码信息对所述添加图文信息后的视频帧序列进行编码,包括:使用所述帧信息,将所述编码器初始化;将所述添加图文信息后的视频帧序列划分为第二基础编码单元;通过初始化后的所述编码器,根据所述片信息和所述编码单元信息,对所述第二基础编码单元进行编码。
- 根据权利要求3所述的方法,所述根据所述片信息和所述编码单元信息,对所述第二基础编码单元进行编码包括:获取各个所述第二基础编码单元的位置信息;依次基于所述位置信息,判断当前所述第二基础编码单元是否与所述图文信息的覆盖区域有关,得到判断结果;根据所述判断结果,确定是否使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码。
- 根据权利要求4所述的方法,所述依次基于所述位置信息,判断当前所述第二基础编码单元是否与所述图文信息的覆盖区域有关,包括:判断当前所述第二基础编码单元是否满足第一条件和第二条件中的任一条件,其中,所述第一条件为当前所述第二基础编码单元位于所述图文信息覆盖的区域;所述第二条件为当前所述第二基础编码单元为帧间模式且满足以下任一条件:参考了所述图文信息覆盖的区域的图像,视频运动矢量预测受到目标编码树单元的影响,其中,所述目标编码树单元为与当前所述第二基础编码单元相邻的、已被判为与所述覆盖区域有关的所述第二基础编码单元。
- 根据权利要求4所述的方法,所述根据所述判断结果,确定是否使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码,包括:在所述判断结果为当前所述第二基础编码单元与所述图文信息覆盖的区域无关的情况下,使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码。
- 根据权利要求4所述的方法,所述根据所述判断结果,确定是否使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码,包括:在所述判断结果为当前所述第二基础编码单元与所述图文信息覆盖的区域有关的情况下,重新确定对所述第二基础编码单元进行编码所使用的编码决策,使用重新确定的编码决策对所述第二基础编码单元进行编码。
- 根据权利要求2至7任一项所述的方法,所述原视频的编码标准为HEVC,所述帧信息存储在所述原视频的头部信息中,所述第一基础编码单元为编码树单元。
- 根据权利要求8所述的方法,所述原视频的头部信息还包括视频参数集、序列参数集和图像参数集。
- 根据权利要求2至7任一项所述的方法,所述原视频的编码标准为VP9,所述原视频的帧信息包含在每个图像帧的头部信息中,所述第一基础编码单元为超级块。
- 根据权利要求1至7任一项所述的方法,所述图文信息包括图片水印、音频水印、字幕、弹幕、画中画、贴纸和魔法表情中的至少一种。
- 一种转码装置,包括:获取模块,用于通过解码器对原视频进行解码,获得所述原视频的视频帧序列及所述原视频的编码信息,其中,所述原视频是需要添加图文信息的视频;添加模块,用于给所述视频帧序列添加所述图文信息,得到添加图文信息后的视频帧序列;编码模块,用于通过编码器,使用所述编码信息对所述添加图文信息后的视频帧序列进行编码,得到新视频。
- 根据权利要求12所述的转码装置,所述获取模块获取的原视频的编码信息包括:所述原视频的帧信息、片信息和编码单元信息,其中,所述帧信息为所述原视频的图像帧的视频特征数据,所述片信息为原视频的片的编码参数,所述编码单元信息为组成所述原视频的每帧图像的第一基础编码单元的编码参数。
- 根据权利要求12所述的转码装置,所述编码模块具体用于:从原视频的编码信息获取所述原视频的帧信息、片信息和编码单元信息,并获取所述添加图文信息后的视频帧序列;使用所述帧信息,将所述编码器初始化;将所述添加图文信息后的视频帧序列划分为第二基础编码单元;通过初始化后的所述编码器,根据所述片信息和所述编码单元信息,对所述第二基础编码单元进行编码。
- 根据权利要求14所述的转码装置,所述编码模块具体用于:获取各个所述第二基础编码单元的位置信息;依次基于所述位置信息,判断当前所述第二基础编码单元是否与所述图文信息的覆盖区域有关,得到判断结果;根据所述判断结果,确定是否使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码。
- 根据权利要求15所述的转码装置,所述编码模块具体用于:判断当前所述第二基础编码单元是否满足第一条件和第二条件中的任一条件,其中,所述第一条件为当前所述第二基础编码单元位于所述图文信息覆盖的区域;所述第二条件为当前所述第二基础编码单元为帧间模式且满足以下任一条件:参考了所述覆盖区域的图像,视频运动矢量预测受到目标编码树单元的影响,其中,所述目标编码树单元为与当前所述第二基础编码单元相邻的、已被判为与所述覆盖区域有关的所述第二基础编码单元。
- 根据权利要求15所述的转码装置,所述编码模块具体用于:在所述判断结果为当前所述第二基础编码单元与所述图文信息的覆盖区域无关的情况下,使用所述片信息和所述编码单元信息对所述第二基础编码单元进行编码。
- 根据权利要求12至17任一项所述的转码装置,所述图文信息包括图片水印、音频水印、字幕、弹幕、画中画、贴纸和魔法表情中的至少一种。
- 一种转码装置,包括:处理器;用于存储处理器可执行指令的存储器;其中,所述处理器被配置为执行上述权利要求1至11任一项所述的方法。
- 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机指令,所述计算机指令被执行时实现如权利要求1至11任一项所述的方法。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811427461.1A CN111225277A (zh) | 2018-11-27 | 2018-11-27 | 转码方法、转码装置和计算机可读存储介质 |
CN201811427461.1 | 2018-11-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020108033A1 true WO2020108033A1 (zh) | 2020-06-04 |
Family
ID=70828837
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/106804 WO2020108033A1 (zh) | 2018-11-27 | 2019-09-19 | 转码方法、转码装置和计算机可读存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111225277A (zh) |
WO (1) | WO2020108033A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111953987A (zh) * | 2020-07-15 | 2020-11-17 | 广州柯维新数码科技有限公司 | 视频转码方法、计算机设备和存储介质 |
CN111953988A (zh) * | 2020-07-15 | 2020-11-17 | 广州柯维新数码科技有限公司 | 视频转码方法、计算机设备和存储介质 |
CN112511836A (zh) * | 2020-11-10 | 2021-03-16 | 北京达佳互联信息技术有限公司 | 一种水印文件的转码方法、装置、服务器和存储介质 |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022184110A1 (zh) * | 2021-03-02 | 2022-09-09 | 北京字节跳动网络技术有限公司 | 用于图像编码的方法、电子设备、存储介质和记录介质 |
CN113014926B (zh) * | 2021-04-30 | 2023-04-07 | 北京汇钧科技有限公司 | 视频的转码方法、装置、电子设备及存储介质 |
CN113139057A (zh) * | 2021-05-11 | 2021-07-20 | 青岛科技大学 | 一种域适应的化工安全隐患短文本分类方法及系统 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100074333A1 (en) * | 2008-09-19 | 2010-03-25 | The Hong Kong University Of Science And Technology | Method and system for transcoding based robust streaming of compressed video |
CN103248898A (zh) * | 2013-05-22 | 2013-08-14 | 张炯 | 数字水印添加、提取方法及其装置 |
CN104641651A (zh) * | 2012-06-12 | 2015-05-20 | 相干逻辑公司 | 用于编码和交付视频内容的分布式体系结构 |
CN105263024A (zh) * | 2015-10-15 | 2016-01-20 | 宁波大学 | 一种抗量化转码的hevc视频流零水印的注册和检测方法 |
CN108769828A (zh) * | 2018-05-23 | 2018-11-06 | 深圳市网心科技有限公司 | 图片水印添加方法、电子装置及计算机可读存储介质 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100539704C (zh) * | 2005-12-08 | 2009-09-09 | 香港中文大学 | 视频信号的编码系数的转换装置及其方法 |
US9894361B2 (en) * | 2009-03-31 | 2018-02-13 | Citrix Systems, Inc. | Framework for quality-aware video optimization |
RU2010135495A (ru) * | 2010-08-24 | 2012-02-27 | ЭлЭсАй Корпорейшн (US) | Видеотранскодер с гибким управлением качеством и сложностью |
WO2018049594A1 (en) * | 2016-09-14 | 2018-03-22 | Mediatek Inc. | Methods of encoder decision for quad-tree plus binary tree structure |
- 2018-11-27: CN application CN201811427461.1A filed (published as CN111225277A, status: Pending)
- 2019-09-19: WO application PCT/CN2019/106804 filed (WO2020108033A1, active Application Filing)
Also Published As
Publication number | Publication date |
---|---|
CN111225277A (zh) | 2020-06-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020108033A1 (zh) | 转码方法、转码装置和计算机可读存储介质 | |
US10701401B2 (en) | Syntax structures indicating completion of coded regions | |
US11665362B2 (en) | Syntax and semantics for buffering information to simplify video splicing | |
US11095877B2 (en) | Local hash-based motion estimation for screen remoting scenarios | |
US10390039B2 (en) | Motion estimation for screen remoting scenarios | |
US11711511B2 (en) | Picture prediction method and apparatus | |
US20150237356A1 (en) | Host encoder for hardware-accelerated video encoding | |
KR20210107865A (ko) | 비디오 디코딩 방법, 비디오 코딩 방법, 장치, 디바이스, 및 저장 매체 | |
CN113273185A (zh) | 视频编码中的段类型 | |
US12034954B2 (en) | Encoder and decoder with support of sub-layer picture rates in video coding | |
US10129566B2 (en) | Standard-guided video decoding performance enhancements | |
KR20230098717A (ko) | 인코딩 방법, 인코딩된 비트스트림 및 인코딩 디바이스 | |
CN111225211A (zh) | 转码方法、转码装置和计算机可读存储介质 | |
US9979983B2 (en) | Application- or context-guided video decoding performance enhancements | |
WO2020062184A1 (zh) | 一种图像处理方法、装置、可移动平台及存储介质 | |
US11197014B2 (en) | Encoding apparatus, decoding apparatus, and image processing system | |
BR112015016254B1 (pt) | Método realizado por um dispositivo de computação, mídia legível por computador e dispositivo de computação |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19891645 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 19891645 Country of ref document: EP Kind code of ref document: A1 |