WO2023071469A1 - Video processing method, electronic device and storage medium - Google Patents

Video processing method, electronic device and storage medium

Info

Publication number
WO2023071469A1
WO2023071469A1 (PCT/CN2022/114283)
Authority
WO
WIPO (PCT)
Prior art keywords
video
sub
processing method
saliency
server
Prior art date
Application number
PCT/CN2022/114283
Other languages
French (fr)
Chinese (zh)
Inventor
许静
孟宇
王剑楠
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2023071469A1


Classifications

    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION (under H: ELECTRICITY; H04: ELECTRIC COMMUNICATION TECHNIQUE)
    • H04N19/102: adaptive coding of digital video signals, characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/167: adaptive coding characterised by the position within a video image, e.g. region of interest [ROI]
    • H04N19/176: adaptive coding in which the coding unit is an image region, the region being a block, e.g. a macroblock
    • H04N19/597: predictive coding specially adapted for multi-view video sequence encoding
    • H04N21/218: selective content distribution; source of audio or video content, e.g. local disk arrays
    • H04N21/2343: processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/258: client or end-user data management, e.g. managing client capabilities, user preferences or demographics

Definitions

  • The present application relates to, but is not limited to, the field of visual technology, and in particular to a video processing method, an electronic device, and a storage medium.
  • VR: virtual reality
  • Virtual reality technology has developed to the point of being widely known. As the main carrier of VR-related resources and the main content consumed by VR users, 360-degree panoramic video based on VR is being accepted and consumed by more and more users.
  • 360-degree panoramic video is usually shot by multiple lenses simultaneously; the captured content is then corrected for distortion and stitched to form complete panoramic content.
  • As the resolution of VR content grows higher and higher, the increasing amount of data also increases the computation required for encoding and decoding and the transmission bandwidth occupied, which greatly increases the consumption of computing and storage resources and affects the viewing quality of the video.
  • Embodiments of the present application provide a video processing method, an electronic device, and a storage medium.
  • An embodiment of the present application provides a video processing method applied to a server.
  • The video processing method includes: acquiring the original video of a film source; performing a saliency calculation on the original video to obtain saliency distribution information; dividing the original video into blocks to obtain multiple sub-videos; and encoding and compressing the sub-videos according to the saliency distribution information.
  • An embodiment of the present application provides a video processing method applied to a playback terminal. The method includes: sending a playback request for a film source to a server so that the server determines the film source; receiving the encoded and compressed sub-videos corresponding to the film source from the server, where the sub-videos are obtained by the server performing a saliency calculation on the original video of the film source to obtain the saliency distribution information of the original video, dividing the original video into blocks to obtain multiple sub-videos, and encoding and compressing the sub-videos according to the saliency distribution information; and decoding the encoded and compressed sub-videos to obtain the playback video.
  • An embodiment of the present application provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and when the processor executes the computer program, the video processing method of any embodiment of the first aspect of the present application is implemented.
  • An embodiment of the present application provides a computer-readable storage medium that stores a program; when the program is executed by a processor, the video processing method of any embodiment of the first aspect of the present application is implemented.
  • FIG. 1 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a video processing method provided by another embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a video processing method provided in another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video processing method provided in another embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a video processing method provided by another embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a video processing method provided by another embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a video processing method provided by another embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a video processing method provided by another embodiment of the present application.
  • FIG. 9 is a schematic diagram of an electronic device provided by an embodiment of the present application.
  • Orientation descriptions such as up, down, front, back, left, and right indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description of the present application, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation; they should therefore not be construed as limiting the embodiments of the present application.
  • "Multiple" means more than two; "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including it. Descriptions such as "first" and "second" serve only to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or their sequence.
  • the embodiment of the present application provides a video processing method, electronic equipment and storage medium.
  • the video processing method can be applied in servers and playback terminals.
  • Based on the predicted saliency distribution information, the divided sub-videos are encoded and compressed differently, so that the size of each encoded and compressed sub-video corresponds to the characteristics of user behavior; this achieves low-bandwidth video transmission while reducing computing and storage resource consumption.
  • the embodiment of the present application provides a video processing method applied to a server.
  • the video processing method in the embodiment of the present application includes but not limited to step S110, step S120, step S130 and step S140.
  • Step S110 acquiring the original video of the film source.
  • the server first obtains the original video of the film source.
  • the film source can be an on-demand film source or a live film source for playback by the playback terminal.
  • The film source can be obtained in any form; the encoding/compression format, file format, and encapsulation format of the injected film source are not limited. The original video is the video source that has not yet been processed by the video processing method of this application.
  • The server in this embodiment can be a video server, and the original video can be a panoramic video or another type of video. When the original video is a panoramic video, the video processing method of this embodiment can be applied to the field of VR technology, and the processed video can be played by a VR playback terminal.
  • The video processed by the video processing method can also be played by playback terminals such as mobile phones, tablet computers, and video players.
  • the original video may be referred to as a panoramic video or a 360-degree panoramic video.
  • Step S120 performing saliency calculation on the original video to obtain saliency distribution information on the original video.
  • The server uses a saliency prediction algorithm to perform a saliency calculation on the 360-degree panoramic video, obtaining a two-dimensional saliency distribution matrix whose dimensions correspond to the length and width (or resolution) of the panoramic video.
  • The length, width, or resolution of the obtained matrix can be the same as that of the original video. The matrix is used as the saliency distribution information of the original video and marks the saliency value of each pixel of the panoramic video.
  • The main purpose of the saliency calculation is to obtain a prediction of user behavior.
  • The saliency prediction algorithm can obtain the saliency value of each pixel in the panoramic video; this embodiment does not limit the specific algorithm. Different saliency prediction algorithms give results of different prediction accuracy, and a matching saliency prediction algorithm can be selected and applied in the video processing method.
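As an illustration of how such a per-pixel saliency matrix might be computed, the sketch below uses the spectral-residual method; the patent does not fix a particular saliency prediction algorithm, so both the choice of method and the normalisation are assumptions made here for the example.

```python
import numpy as np

def saliency_map(frame: np.ndarray) -> np.ndarray:
    """Per-pixel saliency via the spectral-residual approach.

    `frame` is a 2-D grayscale array; the result has the same shape as
    the input, matching the "two-dimensional saliency distribution
    matrix" whose dimensions equal those of the video frame.
    """
    spectrum = np.fft.fft2(frame.astype(np.float64))
    log_amp = np.log1p(np.abs(spectrum))      # log-amplitude spectrum
    phase = np.angle(spectrum)
    # Local average of the log-amplitude (3x3 box filter via shifts).
    avg = sum(np.roll(np.roll(log_amp, dy, 0), dx, 1)
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    residual = log_amp - avg                  # the "spectral residual"
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal / sal.max()                    # normalise to [0, 1]
```

The map can then be refreshed per key frame and fed to the block-weighting step.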
  • 360-degree panoramic video also introduces some new features compared with traditional flat video.
  • Traditional flat video is usually shot with a single lens.
  • 360-degree panoramic video, by contrast, is usually shot by multiple lenses simultaneously; the content is then distortion-corrected and stitched to form complete 360-degree panoramic content. In the subsequent transmission and storage process, the panoramic content is mapped onto a plane in a non-uniform manner for compression, encoding, and transmission.
  • The head-movement data of different users watching the same 360-degree panoramic content shows a high degree of consistency: the areas different users tend to watch, and how long they watch them, are similar. Furthermore, in an immersive viewing environment, after quickly observing the entire scene, users tend to fixate on certain areas. Given these characteristics of user behavior, a saliency calculation on the complete 360-degree panoramic video can produce a two-dimensional saliency distribution matrix corresponding to them.
  • Saliency, also known as visual saliency, is an important visual feature of a video or image, reflecting the degree of importance the human eye attaches to certain areas of the image.
  • For a given video, the user is only interested in some areas, i.e., the areas the user tends to watch. These areas of interest represent the user's query intention, while the other, uninteresting areas are irrelevant to it; the areas characterized by saliency are the areas of the video that most arouse the user's interest and best express the video content.
  • Step S130 divide the original video into blocks to obtain multiple sub-videos.
  • The server divides the panoramic video into blocks (tiles) to obtain a sub-video for each block; the blocks, and therefore the sub-videos, can be of the same or different sizes.
  • The server can use slightly larger blocks in the directions the user's viewing angle tends toward in the panoramic video and smaller blocks elsewhere; this can be set according to actual needs to improve the encoding and decoding effect and the playback quality of the block sub-videos. The number of blocks can likewise be set according to actual needs.
  • For example, the server divides a panoramic video with a resolution of 3840x1920 into a 4x3 grid of blocks along its length and width, obtaining 12 sub-videos of 960x640 each. The server can adjust the number of blocks according to the bandwidth requirement: when the bandwidth achieved with the 12 sub-videos of this example meets the design requirement, no other block count is needed. The server can set the number of blocks according to actual needs or by an artificial intelligence algorithm.
  • The more blocks there are, the more sub-videos are obtained and the more pronounced the bandwidth reduction, but the required processing resources also increase.
  • The server can therefore set the number of blocks to satisfy both the bandwidth and the processing requirements; this application does not specifically limit it.
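The 4x3 split of a 3840x1920 frame described above can be sketched as follows; this operates on a single frame for clarity, whereas a real encoder would tile whole video streams.

```python
import numpy as np

def split_into_tiles(frame: np.ndarray, cols: int = 4, rows: int = 3):
    """Split a frame into a grid of equally sized tiles.

    With the 3840x1920 example from the description and a 4x3 grid,
    each of the 12 tiles is 960x640 pixels.
    """
    h, w = frame.shape[:2]
    th, tw = h // rows, w // cols        # tile height and width
    return [frame[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(rows) for c in range(cols)]
```

Unequal tile sizes (larger tiles toward the preferred viewing direction) would replace the uniform grid arithmetic with per-tile boundaries.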
  • A viewport-dependent (window-dependent) transmission mechanism is a necessary and feasible panoramic-video transmission scheme.
  • Under such a mechanism, the content the user is highly concerned about, i.e., the content of the window currently being watched, is transmitted in high quality, while the transmission quality of areas that will not be seen for the moment can be appropriately reduced, or those areas not transmitted at all.
  • This embodiment adopts a Tiled Streaming strategy: the uncompressed original high-resolution video is divided, frame by frame in the pixel domain, into several lower-resolution spatial block videos, and the sub-video of each block is encoded into independently decodable media content. Such a scheme usually requires encoding each block at different qualities.
  • In one embodiment, the playback terminal downloads high-quality block content from the server for the area within the user's viewing angle, and low-quality block content for the areas outside it.
  • A block in this embodiment can be understood as a Tile in High Efficiency Video Coding (HEVC): an image is divided horizontally and vertically into several rectangular areas, and these rectangular areas are called blocks. The blocks can be those defined by MPEG HEVC encoding, which is not specifically limited in this application.
  • HEVC: High Efficiency Video Coding
  • Step S140: encoding and compressing the sub-videos according to the saliency distribution information.
  • The server encodes and compresses the sub-videos according to the saliency distribution information; it can compress and encode all of the obtained sub-videos, or only the sub-videos of some areas.
  • The main purpose of this embodiment is to obtain a prediction of user behavior through the saliency calculation. Based on the predicted saliency distribution information, the divided sub-videos are encoded and compressed differently, so that the size of each encoded sub-video corresponds to the characteristics of user behavior; low-bandwidth video transmission is thus achieved while reducing computing and storage resource consumption.
  • After encoding and compression, the sub-videos of the areas the user tends to watch have higher video quality, while the sub-videos at other positions have lower quality. Since the areas the user tends to watch occupy a small proportion of the panoramic video, the number of high-quality sub-videos is smaller than the number of low-quality ones. Compared with the original video, the compressed video obtained by the video processing method of this embodiment can therefore be transmitted at low bandwidth while reducing computing and storage resource consumption.
  • step S120 may also include but not limited to step S210 and step S220.
  • Step S210 acquiring the first frame or key frame of the original video.
  • Step S220 performing saliency calculation on the original video according to the first frame or key frame, to obtain saliency distribution information on the first frame or key frame.
  • The saliency distribution information obtained by the server is calculated from the first frame or a key frame of the original video; the server obtains that frame according to preset requirements.
  • The key frame can be the most salient image frame of the original video, selected by the user or developer according to actual needs.
  • The server obtains an image of the original video at that frame and performs the saliency calculation on it; the calculation result is then applied to the other image frames of the panoramic video.
  • The saliency distribution information obtained from the first frame or key frame can represent the predicted user behavior for the whole original video, and is applied to the image frames of the entire video for encoding and compression.
  • The server can refresh the saliency distribution information periodically or aperiodically according to actual needs, reacquiring a key frame to calculate new saliency distribution information; this application does not specifically limit this.
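A minimal sketch of the key-frame reuse described above: the saliency map is computed only at key frames and reused for the frames in between. The fixed refresh period is an assumption for the example; the description allows both periodic and aperiodic refreshing.

```python
def saliency_per_frame(frames, compute_saliency, refresh_every=250):
    """Recompute saliency only at key frames (here: every
    `refresh_every`-th frame, an assumed period) and reuse the result
    for the following frames, as in steps S210/S220."""
    current, out = None, []
    for i, frame in enumerate(frames):
        if i % refresh_every == 0:
            current = compute_saliency(frame)  # e.g. the map from step S120
        out.append(current)
    return out
```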
  • step S140 may also include but not limited to step S310 to step S330.
  • step S310 the saliency weight value corresponding to each sub-video is obtained according to the saliency distribution information.
  • step S320 the encoding parameters of each sub-video are obtained according to the saliency weight value.
  • Step S330 encoding and compressing the corresponding sub-video according to the encoding parameters.
  • The server calculates the saliency weight of the sub-video of each block: according to the saliency distribution information it obtains the saliency value of each sub-video and the total saliency value of the whole panoramic video, and the saliency weight value of each sub-video is the ratio of that sub-video's saliency value to the total.
  • The server then assigns different encoding parameters according to each sub-video's saliency weight value, so video files of different qualities are obtained. After the video processing method of this embodiment, the resulting compressed video, compared with the original video, achieves low-bandwidth transmission while reducing computing and storage resource consumption.
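The weight computation in steps S310/S320 follows directly from the description: each tile's weight is its saliency mass divided by the total saliency of the panoramic frame.

```python
import numpy as np

def tile_saliency_weights(sal_map: np.ndarray, cols: int = 4, rows: int = 3):
    """Weight of each tile = (sum of saliency values inside the tile) /
    (total saliency of the whole map); the weights sum to 1."""
    h, w = sal_map.shape
    th, tw = h // rows, w // cols
    total = sal_map.sum()
    return [sal_map[r * th:(r + 1) * th, c * tw:(c + 1) * tw].sum() / total
            for r in range(rows) for c in range(cols)]
```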
  • step S320 may also include but not limited to steps S410 to S440.
  • Step S410 acquiring the pre-allocated total quality parameter corresponding to the original video.
  • Step S420 acquiring a preset quality parameter corresponding to the sub-video.
  • Step S430: obtaining the pre-allocated quality parameter of the sub-video according to the pre-allocated total quality parameter and the saliency weight value.
  • Step S440: determining the encoding parameters of the sub-video according to the pre-allocated quality parameter and the preset quality parameter.
  • The server sets different encoding parameters for the sub-videos according to their different quality parameters. The server obtains the pre-allocated total quality parameter corresponding to the panoramic video.
  • The total quality parameter to be set is smaller than the original quality parameter of the original video; the preset quality parameter corresponding to each sub-video is also obtained.
  • The preset quality parameter can be the encoder's preset minimum or maximum value for the sub-video. The server then distributes the pre-allocated total quality parameter to each sub-video according to its saliency weight value to obtain each sub-video's pre-allocated quality parameter, and determines the sub-video's encoding parameters from the relationship between the pre-allocated quality parameter and the preset quality parameter.
  • The quality parameter in this embodiment may be the video bit rate, or another video coding parameter that affects image quality, such as the quantization parameter (QP), key frame interval (GOP), resolution, or frame rate.
  • QP: quantization parameter
  • GOP: key frame interval (group of pictures)
  • The server compares the pre-allocated quality parameter with the preset quality parameter and chooses the appropriate one as the encoding parameter.
  • Depending on the quality parameter used, the encoding parameters obtained by the server differ.
  • For example, when the quality parameter is the bit rate, the preset quality parameter is a preset bit rate, which is related to the encoding device for the sub-video and has a certain limit. When that limit is reached, the server can no longer assign a higher quality parameter to the sub-video, so it uses the preset quality parameter as the encoding parameter to encode and compress the sub-video.
  • As another example, when the quality parameter is the resolution, the preset quality parameter is a preset resolution related to the size of the block's sub-video; the server can then use the pre-allocated quality parameter as the encoding parameter to encode and compress the sub-video.
  • In general, the server can use either the pre-allocated quality parameter or the preset quality parameter as the encoding parameter for encoding and compressing the sub-video; the above are only examples and are not intended to limit the embodiments of the application.
  • step S320 may also include but not limited to step S510 and step S520.
  • Step S510: obtaining a difference according to the pre-allocated quality parameter and the preset quality parameter.
  • Step S520: obtaining an updated pre-allocated total quality parameter according to the difference and the pre-allocated total quality parameter.
  • After obtaining the pre-allocated quality parameter of each sub-video, the server computes the difference between the pre-allocated quality parameter and the preset quality parameter, and uses this difference together with the original pre-allocated total quality parameter to calculate an updated pre-allocated total quality parameter.
  • The pre-allocated total quality parameter can be a preset parameter whose size corresponds to the bandwidth; the updated total obtained through the difference correction can then be smaller, realizing low-bandwidth video transmission.
  • The updated pre-allocated total quality parameter can also increase, but for the overall panoramic video the proportion of sub-videos whose pre-allocated quality parameter exceeds the preset quality parameter is not high, so on the whole the updated total is reduced compared with the original pre-allocated total quality parameter.
  • In some cases the updated pre-allocated total quality parameter may equal the original one, but it will still be smaller than the original quality parameter of the original video; this embodiment does not specifically limit this.
  • In one embodiment, a rate selection algorithm is used to determine the specific encoding bit rate of each sub-video. The steps of the rate selection algorithm are as follows:
  • 1) R_i = R_total × w_i, where i denotes the i-th sub-video and w_i is the saliency weight value corresponding to the i-th sub-video.
  • 2) If R_i is below the minimum bit rate R_low corresponding to the sub-video, R_i is set to R_low and the difference is recorded.
  • 3) If R_i is above the maximum bit rate R_high corresponding to the sub-video, R_i is set to R_high and the difference is recorded.
  • In some embodiments, the differences from steps 2) and 3) may not be used to update the pre-allocated total quality parameter.
  • For example, when the preset quality parameter is smaller than the pre-allocated quality parameter, the server can use the preset quality parameter as the encoding parameter of the corresponding sub-video; for the remaining sub-videos, the values already allocated are subtracted from the pre-allocated total quality parameter, and the remaining total is then distributed according to the saliency weight values of the remaining sub-videos to obtain their pre-allocated quality parameters and hence their encoding parameters. Likewise, when the preset quality parameter is greater than the pre-allocated quality parameter, the server can use the pre-allocated quality parameter as the encoding parameter of the corresponding sub-video and update the pre-allocated total quality parameter accordingly.
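Putting the rate-selection steps together: R_i = R_total * w_i, clamped to the per-tile limits R_low and R_high, with the clamping differences fed back into the pre-allocated total (steps S510/S520). Treating the feedback as a simple subtraction is an assumption made here; the description leaves the exact update rule open.

```python
def select_tile_bitrates(r_total, weights, r_low, r_high):
    """R_i = R_total * w_i, clamped to [R_low, R_high]; the accumulated
    clamping difference is used to refresh the pre-allocated total."""
    rates, saved = [], 0.0
    for w in weights:
        ideal = r_total * w
        rate = min(max(ideal, r_low), r_high)
        saved += ideal - rate        # >0 when capped down, <0 when raised
        rates.append(rate)
    updated_total = r_total - saved  # smaller total when tiles were capped
    return rates, updated_total
```

With a 1000 kbps total, weights (0.5, 0.3, 0.2), and a 400 kbps per-tile cap, the most salient tile is capped and the 100 kbps saved lowers the refreshed total.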
  • the video processing method in the embodiment of the present application may further include but not limited to step S610 and step S620.
  • Step S610: acquiring the video encapsulation protocol.
  • Step S620: performing streaming-media transmission encapsulation on the encoded and compressed sub-videos according to the encapsulation protocol.
  • the server needs to perform streaming media transmission encapsulation on the encoded video.
  • the server can obtain the video encapsulation protocol according to the video playback device and perform streaming media transmission encapsulation on the encoded and compressed sub-videos according to the different encapsulation protocols; for example, if an encapsulation protocol requires periodic encapsulation, the server completes the encapsulation of the encoded and compressed sub-videos once within each period.
  • the encapsulation protocols include but are not limited to HLS, DASH, MSS, etc.; the embodiments of this application place no specific limitation on them.
  • the server generates a playback index file for the sub-videos according to the encoded and compressed sub-videos, adds block information fields of the sub-videos to the playback index file, and marks the block number and position information corresponding to each video file, for identification by the playback terminal.
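A server-side index of this kind can be sketched as below. The JSON layout and field names (`file`, `block`, `row`, `col`) are invented for illustration; a real deployment would emit an MPD or M3U8, and the row-major block-id assignment is an assumption.

```python
import json

def build_play_index(tile_files, cols):
    # tile_files: ordered list of encoded sub-video file names; block ids
    # are assigned row-major, and (row, col) records each block's position
    # so the player can map files back onto the frame grid.
    blocks = []
    for block_id, name in enumerate(tile_files):
        blocks.append({
            "file": name,
            "block": block_id,
            "row": block_id // cols,
            "col": block_id % cols,
        })
    return json.dumps({"blocks": blocks})

index = build_play_index([f"tile_{i}.mp4" for i in range(12)], cols=4)
```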
  • the embodiment of the present application also provides a video processing method, which is applied to the playback terminal.
  • the video processing method in the embodiment of the present application includes, but is not limited to, step S710, step S720 and step S730.
  • Step S710: send a play request for the film source to the server, so that the server determines the film source according to the play request.
  • Step S720: receive the encoded and compressed sub-videos corresponding to the film source sent by the server, where the encoded and compressed sub-videos are obtained by the server performing saliency calculation on the original video of the film source to obtain the saliency distribution information on the original video, dividing the original video into blocks to obtain multiple sub-videos, and then encoding and compressing the sub-videos according to the saliency distribution information.
  • Step S730: decode the encoded and compressed sub-videos to obtain the playback video.
  • the video processing method mentioned in this embodiment is applied on the playback terminal, and the encoded and compressed sub-videos in this embodiment are obtained by the video processing method executed by the server in the above embodiment, so the details are not repeated here. The playback terminal sends a play request for the film source to the server; after the server determines the corresponding film source according to the play request, the playback terminal receives the encoded and compressed sub-videos corresponding to the film source sent by the server, and decodes the encoded and compressed sub-videos to obtain the playback video.
  • the following takes the case where the original video is a 360-degree panoramic video as an example.
  • the playback terminal in this embodiment is a VR playback terminal, but this does not constitute a limitation on the embodiments of the application.
  • the following embodiments may refer to the original video as a panoramic video or a 360-degree panoramic video.
  • the main purpose of the saliency calculation is to obtain a prediction of user behavior; based on the saliency distribution information obtained from different predictions, the different sub-videos after division are encoded and compressed so that the size of each encoded and compressed sub-video corresponds to the characteristics of user behavior. The encoded and compressed sub-videos obtained by the playback terminal thus have different qualities according to the characteristics of the user's viewing position, realizing low-bandwidth video transmission and reducing the consumption of computing and storage resources.
  • the play request includes the area selected by the user within the viewing angle of the playback terminal, and the above step S720 may further include, but is not limited to, step S810 and step S820.
  • Step S810: according to the region corresponding to the selected area, determine the sub-videos corresponding to that region of the original video blocks.
  • Step S820: receive the encoded and compressed sub-videos corresponding to the selected area sent by the server.
  • the playback terminal can download all the sub-videos of the panoramic video from the server, or it can, according to the area selected by the user within the viewing angle of the playback terminal, determine and obtain only the sub-videos corresponding to that area of the original video blocks.
  • the selected area can be the area selected by the user, or the area where the user's perspective is located.
  • downloading only the correspondingly encoded and compressed sub-videos realizes low-bandwidth video transmission and reduces the consumption of computing and storage resources.
  • when the playback device is not a VR playback device but another type of playback device such as a mobile phone, the user can select a viewing area on the playback device, and this area serves as the selected area in the above embodiment, so that the playback device can obtain and decode the sub-videos in this area according to the user's selection, realizing low-bandwidth video transmission and reducing the consumption of computing and storage resources.
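Determining which blocks fall in the selected area amounts to a rectangle-overlap test against the tile grid. A minimal sketch under stated assumptions: an even cols×rows grid, viewport given in frame pixels, and no horizontal wrap-around at the 360-degree seam (a real panoramic player would have to handle the seam).

```python
def tiles_in_view(view, frame_w, frame_h, cols, rows):
    """Return block ids (row-major) of grid tiles overlapping a viewport.

    view is (x, y, w, h) in frame pixels; wrap-around is ignored for brevity.
    """
    vx, vy, vw, vh = view
    tw, th = frame_w // cols, frame_h // rows
    hit = []
    for r in range(rows):
        for c in range(cols):
            x0, y0 = c * tw, r * th
            # standard axis-aligned rectangle intersection test
            if x0 < vx + vw and x0 + tw > vx and y0 < vy + vh and y0 + th > vy:
                hit.append(r * cols + c)
    return hit
```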
  • Embodiment 1: this embodiment of the present application provides a panoramic video file processing method applied to the on-demand service; the specific steps are as follows:
  • the complete panoramic video film source for on-demand is injected into the video server; the encoding/compression format and file encapsulation format of the injected film source are not limited.
  • the video encoding format can be AVC/H.264, HEVC/H.265, etc.
  • the file encapsulation format can be MP4, MPEG-TS, etc.
  • 1.2) Use the saliency prediction algorithm to calculate the saliency of the 360-degree panoramic video, and obtain a two-dimensional matrix of saliency distribution with the same length and width (resolution) as the panoramic video.
  • taking a 4K panoramic video with a resolution of 3840x1920 as an illustration, the saliency prediction result is a two-dimensional matrix of size 3840x1920, and the two-dimensional matrix marks the saliency value of each pixel.
  • the main purpose of the saliency calculation here is to obtain a prediction of user behavior, and there is no restriction on the specific algorithm; different saliency prediction algorithms give different prediction accuracies, which may affect the final optimization results.
  • step 1.2), step 1.3) and step 1.4) can be performed for the entire video, or only for the first frame or some key frames, with the calculation results then applied to the other image frames in the video.
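Before bit rates can be assigned, the per-pixel saliency matrix from step 1.2) has to be reduced to one weight per block. The patent does not fix a reduction method, so the sketch below makes two assumptions: an even cols×rows grid, and a sum-then-normalize reduction over each block's pixels.

```python
def tile_weights(saliency, cols, rows):
    # saliency: H x W list of per-pixel saliency values (the 2D matrix
    # produced by the prediction step); returns one normalized weight
    # per block, row-major.
    h, w = len(saliency), len(saliency[0])
    th, tw = h // rows, w // cols
    sums = []
    for r in range(rows):
        for c in range(cols):
            sums.append(sum(saliency[y][x]
                            for y in range(r * th, (r + 1) * th)
                            for x in range(c * tw, (c + 1) * tw)))
    total = sum(sums)
    # fall back to uniform weights if the map is all zeros
    return [s / total for s in sums] if total else [1.0 / len(sums)] * len(sums)
```

For the 3840x1920 example this would be called with `cols=4, rows=3` (or whatever grid is chosen), and the resulting weights feed directly into the bit rate selection step.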
  • the specific value of the encoding parameter is determined.
  • the encoding parameter here can be the video bit rate, or other video encoding parameters that affect the image quality, such as quantization parameters, key frame intervals, resolution, frame rate, etc.
  • a bit rate selection algorithm is used to determine the specific encoding bit rate of each block.
  • the steps of the bit rate selection algorithm are as follows:
  • the minimum code rate corresponding to the sub-video is R low
  • the order of step 1.5.2) and step 1.5.3) can be changed.
  • in step 1.5), decode the original panoramic video, and then perform HEVC tile encoding based on MCTS (Motion-Constrained Tile Sets).
  • the compression bit rate is constrained to complete the encoding operation; in this example, 12 compressed videos that can be decoded completely independently are obtained through encoding.
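The patent states only that 12 independently decodable streams are produced, not the grid layout; assuming a 4×3 grid over the 3840x1920 frame (an assumption, not stated in the source) yields 12 tiles of 960x640 each. A small sketch of the tile geometry handed to the encoder:

```python
def tile_grid(frame_w, frame_h, cols, rows):
    # rectangles (x, y, w, h) for an even cols x rows tiling, row-major
    tw, th = frame_w // cols, frame_h // rows
    return [(c * tw, r * th, tw, th) for r in range(rows) for c in range(cols)]

tiles = tile_grid(3840, 1920, cols=4, rows=3)   # 12 tiles of 960x640
```

Each rectangle would then be encoded as one motion-constrained tile at the bit rate selected for it.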
  • Encapsulation protocols include but are not limited to HLS, DASH, and MSS, etc., and the block information field is added to the playback index file to mark the block number and location information corresponding to each video file for playback terminal identification.
  • Embodiment 2: this embodiment of the present application provides an interaction process between a VR playback terminal and a video server in a panoramic video on-demand service; the specific steps are as follows:
  • the VR playback terminal initiates a request to the video server for the playback index file index.mpd; index.mpd describes the name, storage path, corresponding block number, position, and other information of each video file.
  • the video server returns the playback index file index.mpd.
  • the VR playback terminal parses index.mpd, obtains video file information, and initiates a video file download request to the video server. It can download video files corresponding to all segments, or only download files corresponding to segments within the viewing angle range.
  • the video server sends the video file to the VR playback terminal.
  • the VR player terminal reorganizes, decodes and plays the video file.
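The client side of this interaction can be sketched as follows. The XML element and attribute names below are illustrative stand-ins, not the actual DASH MPD schema, and the index content is a toy inline string rather than a real server response.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for index.mpd; a real MPD uses AdaptationSet/Representation
# elements rather than these invented names.
INDEX = """<index>
  <video file="tile_0.mp4" block="0" row="0" col="0"/>
  <video file="tile_1.mp4" block="1" row="0" col="1"/>
</index>"""

def parse_index(text):
    # extract per-file block number and position so the player can decide
    # which files cover the current viewing angle
    root = ET.fromstring(text)
    return [{"file": v.get("file"), "block": int(v.get("block")),
             "row": int(v.get("row")), "col": int(v.get("col"))}
            for v in root.findall("video")]

entries = parse_index(INDEX)
```

With the entries in hand, the player either downloads every file or filters them against the viewport-tile ids before issuing download requests.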
  • Embodiment 3: this embodiment of the present application provides a panoramic video file processing method applied to the live broadcast service; the specific steps are as follows:
  • the video server pulls or receives a 360-degree panoramic live stream.
  • the coding format of the live stream and the packaging protocol of the streaming transmission are not limited.
  • the video coding protocol can be AVC/H.264, HEVC/H.265, etc.
  • the encapsulation protocol of the streaming transmission can be RTSP, RTMP, HLS, etc.
  • the saliency prediction result is a two-dimensional matrix with a size of 3840x1920, and the two-dimensional matrix marks the saliency value of each pixel.
  • the above steps 3.2), 3.3) and 3.4) can be calculated for a certain key frame or some key frames of the live stream, and the calculation results are also applied to other image frames in the live stream.
  • the specific value of the encoding parameter is determined.
  • the encoding parameter here can be the video bit rate, or other video encoding parameters that affect the image quality, such as quantization parameters, key frame intervals, resolution, frame rate, etc.
  • a bit rate selection algorithm is used to determine the specific encoding bit rate of each block.
  • the steps of the bit rate selection algorithm are as follows:
  • the minimum code rate corresponding to the sub-video is R low
  • the order of step 3.5.2) and step 3.5.3) can be changed.
  • the panoramic video live stream in the period is decoded, and then HEVC tile encoding based on MCTS is performed.
  • the compression bit rate is constrained to complete the encoding operation; in this example, 12 compressed videos that can be decoded completely independently are obtained through encoding.
  • the encapsulation protocols include but are not limited to HLS, DASH, MSS, etc.; the block information field is added to the playback index file to mark the block number and position information corresponding to each video file, for identification by the playback terminal.
  • Embodiment 4: this embodiment of the present application provides an interaction process between a VR playback terminal and a video server in a panoramic video live broadcast service; the specific steps are as follows:
  • the VR playback terminal periodically initiates a request to the video server to play the index file index.mpd.
  • index.mpd describes the name, storage path, corresponding block number, location and other information of the video files in the last few cycles.
  • the video server returns the latest playback index file index.mpd.
  • the VR playback terminal parses index.mpd, obtains video file information, and initiates a video file download request to the video server. It can download video files corresponding to all segments, or only download files corresponding to segments within the viewing angle range.
  • the video server sends the video file to the VR playback terminal.
  • the VR player terminal reorganizes, decodes and plays the video file.
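Because the live index is refreshed periodically and describes only the last few periods, the player needs to fetch just the files it has not already downloaded. A minimal sketch; the function name and list-based bookkeeping are assumptions for illustration.

```python
def segments_to_fetch(index_files, already_have):
    # index_files: file names listed in the latest index.mpd (covering the
    # last few periods); return only those not yet downloaded, preserving
    # index order so playback stays sequential.
    have = set(already_have)
    return [f for f in index_files if f not in have]
```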
  • the server in the above embodiments has video encoding and decoding and storage capabilities, and the playback terminal has panoramic video decoding and playback capabilities.
  • FIG. 9 shows an electronic device 100 provided by an embodiment of the present application.
  • the electronic device 100 includes: a processor 101 , a memory 102 , and a computer program stored on the memory 102 and operable on the processor 101 , and the computer program is used to execute the above video processing method when running.
  • the processor 101 and the memory 102 may be connected through a bus or in other ways.
  • the memory 102, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the video processing method described in the embodiments of the present application.
  • the processor 101 implements the above video processing method by running the non-transitory software programs and instructions stored in the memory 102 .
  • the memory 102 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created when executing the aforementioned video processing method.
  • the memory 102 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one storage device, a flash memory device or other non-transitory solid-state storage device.
  • the memory 102 may optionally include memory remotely located relative to the processor 101, and such remote memory may be connected to the electronic device 100 through a network; examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the non-transitory software programs and instructions required to implement the above video processing method are stored in the memory 102; when they are executed by the processor 101, the above video processing method is performed, for example, the method steps in FIG. 1.
  • the embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions, and the computer-executable instructions are used to execute the above-mentioned video processing method.
  • the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are executed by one or more control processors, for example, performing method steps S110 to S140 in FIG. 1, method steps S210 to S220 in FIG. 2, method steps S310 to S330 in FIG. 3, method steps S410 to S440 in FIG. 4, method steps S510 to S520 in FIG. 5, method steps S610 to S620 in FIG. 6, method steps S710 to S730 in FIG. 7, and method steps S810 to S820 in FIG. 8.
  • the video processing method in the embodiment of the present application can be applied to a server or a playback terminal.
  • the server divides the original video into blocks to obtain multiple sub-videos, and then encodes and compresses the sub-videos according to the saliency distribution information to obtain the encoded and compressed sub-videos; after the playback terminal sends a play request to the server, it receives the encoded and compressed sub-videos corresponding to the film source sent by the server, and the playback terminal decodes them to obtain the playback video.
  • the different sub-videos obtained after division are encoded and compressed so that the size of each encoded and compressed sub-video corresponds to the characteristics of user behavior; low-bandwidth video transmission can thus be realized while the consumption of computing and storage resources is reduced.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

The present application discloses a video processing method, an electronic device and a storage medium. The video processing method used in a server comprises: acquiring an original video of a film source (S110); calculating the saliency of the original video to obtain information about saliency distribution on the original video (S120); dividing the original video into blocks to obtain multiple sub-videos (S130); and encoding and compressing the sub-videos according to the saliency distribution information (S140).

Description

Video processing method, electronic device and storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202111239873.4 filed on October 25, 2021, the entire content of which is hereby incorporated into this application by reference.
Technical Field
The present application relates to, but is not limited to, the field of visual technology, and in particular to a video processing method, an electronic device, and a storage medium.
Background
Virtual reality (VR) technology is now widely known, and as the main carrier of virtual-reality-related resources and the main content consumed by virtual reality users, VR-based 360-degree panoramic video is being accepted and consumed by more and more users. Whereas traditional flat video is usually shot with a single lens, 360-degree panoramic video is usually shot with multiple lenses simultaneously, after which the content is distortion-corrected and stitched into complete panoramic content. In the related art, as VR resolution grows higher and higher, the increase in data volume correspondingly increases the amount of computation in processing flows such as encoding and decoding, so the transmission bandwidth occupied becomes higher and higher, greatly increasing the consumption of computing and storage resources and affecting the viewing quality of the video.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.
Embodiments of the present application provide a video processing method, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present application provides a video processing method applied to a server. The video processing method includes: acquiring an original video of a film source; performing saliency calculation on the original video to obtain saliency distribution information on the original video; dividing the original video into blocks to obtain multiple sub-videos; and encoding and compressing the sub-videos according to the saliency distribution information.
In a second aspect, an embodiment of the present application provides a video processing method applied to a playback terminal. The video processing method includes: sending a play request for a film source to a server, so that the server determines the film source according to the play request; receiving the encoded and compressed sub-videos corresponding to the film source sent by the server, where the encoded and compressed sub-videos are obtained by the server performing saliency calculation on the original video of the film source to obtain saliency distribution information on the original video, dividing the original video into blocks to obtain multiple sub-videos, and then encoding and compressing the sub-videos according to the saliency distribution information; and decoding the encoded and compressed sub-videos to obtain a playback video.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor, when executing the computer program, implements the video processing method described in any one of the embodiments of the first aspect or the second aspect of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the storage medium stores a program, and the program is executed by a processor to implement the video processing method described in any one of the embodiments of the first aspect or the second aspect of the present application.
Additional features and advantages of the application will be set forth in the description that follows, and in part will be apparent from the description, or may be learned by practice of the application. The objectives and other advantages of the application can be realized and attained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Description of the Drawings
The accompanying drawings are provided for a further understanding of the technical solution of the present application and constitute a part of the specification. Together with the embodiments of the present application, they serve to explain the technical solution of the present application and do not constitute a limitation on it.
FIG. 1 is a schematic flowchart of a video processing method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 3 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 4 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 5 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 6 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 7 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 8 is a schematic flowchart of a video processing method provided by another embodiment of the present application;
FIG. 9 is a schematic diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
In the description of the present application, it should be understood that orientation descriptions such as up, down, front, back, left, and right indicate orientations or positional relationships based on those shown in the drawings; they are only for convenience and simplification of description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and therefore should not be construed as limiting the embodiments of the present application.
It should be understood that, in the description of the embodiments of the present application, "multiple" (or "a plurality of") means two or more; "greater than", "less than", "exceeding", etc. are understood as excluding the number itself, while "above", "below", "within", etc. are understood as including the number itself. Terms such as "first" and "second" are only used to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
In the description of the embodiments of the present application, unless otherwise expressly defined, terms such as "set", "install", and "connect" should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meaning of these terms in the embodiments of the present application in combination with the specific content of the technical solution.
With the maturing of the related upstream and downstream industry chains and the development of software and hardware technologies in key links, immersive virtual reality technology is becoming familiar to more and more people. As the main carrier of virtual-reality-related resources and the main content consumed by virtual reality users, 360-degree panoramic video starts from 4K VR, with 8K VR as the future development trend, possibly advancing to 12K VR, 24K VR, and beyond. 8K VR occupies more than a hundred megabits of bandwidth, and the increase in data volume correspondingly increases the amount of computation in processing flows such as encoding and decoding. Therefore, how to guarantee that users watch high-quality VR content while reducing bandwidth has become an urgent problem for the further deployment of VR applications.
On this basis, embodiments of the present application provide a video processing method, an electronic device, and a storage medium. The video processing method can be applied in a server and a playback terminal. The main purpose of the saliency calculation is to obtain a prediction of user behavior; based on the saliency distribution information obtained from different predictions, the different sub-videos after division are encoded and compressed so that the size of each encoded and compressed sub-video corresponds to the characteristics of user behavior, thereby realizing low-bandwidth video transmission while reducing the consumption of computing and storage resources.
A detailed description is given below.
An embodiment of the present application provides a video processing method applied to a server. Referring to FIG. 1, the video processing method in this embodiment includes, but is not limited to, step S110, step S120, step S130, and step S140.
Step S110: acquire the original video of the film source.
In one embodiment, the server first acquires the original video of the film source. The film source can be an on-demand film source or a live film source for playback by the playback terminal, and can be obtained in any form; the encoding/compression format and file encapsulation format of the injected film source are not limited. The original video is the video source before being processed by the video processing method of this application. The server in this embodiment can be a video server, and the original video can be a panoramic video or another type of video. When the original video is a panoramic video, the video processing method of this embodiment can be applied in the field of VR technology, and the processed video can be played by a VR playback terminal; when the original video is another type of video, the processed video can be played by playback terminals such as mobile phones, tablet computers, and video players. The embodiments of this application take the case where the original video is a 360-degree panoramic video as an example, but this does not constitute a limitation; the following embodiments may refer to the original video as a panoramic video or a 360-degree panoramic video.
Step S120: perform saliency calculation on the original video to obtain saliency distribution information on the original video.
In one embodiment, the server uses a saliency prediction algorithm to calculate the saliency of the 360-degree panoramic video and obtains a two-dimensional saliency distribution matrix corresponding to the length and width (resolution) of the panoramic video. It should be noted that the length, width, or resolution of the obtained saliency distribution matrix can be the same as that of the original video; the matrix serves as the saliency distribution information on the original video and marks the saliency value of each pixel of the panoramic video. The main purpose of the saliency calculation here is to obtain a prediction of user behavior; through the saliency prediction algorithm of this embodiment, the saliency value of every pixel in the panoramic video can be obtained. The embodiments of this application place no specific restriction on the algorithm; it can be understood that different saliency prediction algorithms give different prediction accuracies, and a matching saliency prediction algorithm can be selected and applied in the video processing method.
As a new kind of media content, 360-degree panoramic video introduces some new characteristics compared with traditional flat video. On the capture side, traditional flat video is usually shot with a single lens, whereas 360-degree panoramic video is usually shot with multiple lenses simultaneously; the captured content is then distortion-corrected and stitched into complete 360-degree panoramic content. In subsequent transmission and storage, the 360-degree panoramic content is mapped onto a plane in a non-uniform manner for compression coding and transmission. Compared with traditional flat video, these processing steps introduce additional quality loss into 360-degree panoramic video. When the video content is played on a playback terminal, the user typically wears a head-mounted display. Unlike traditional flat video, which presents the complete content directly at the center of the user's field of view, a user watching a 360-degree panoramic video can only view local content one field of view at a time, and can autonomously select the viewing region by actions such as turning the head. The immersive viewing mode isolates external visual interference, while the high degree of freedom and local visibility mean that the user's perceived quality is influenced mainly by local content.
It should be noted that, according to analyses of immersive viewing behavior, the head-movement data of different users watching the same 360-degree panoramic video content show a high degree of consistency: the regions different users tend to watch and the time they dwell on them are similar. Furthermore, in an immersive viewing environment, after quickly scanning the whole scene, users tend to fixate on certain regions. Based on this consistency, combined with the aforementioned dominant influence of local content on perceived quality, the embodiments of the present application compute saliency over the complete 360-degree panoramic video according to the characteristics of user behavior, obtaining a two-dimensional saliency distribution matrix that corresponds to those characteristics. Saliency is an important visual feature of a video or image, reflecting the degree of attention the human eye pays to certain regions. For a given video, the user is interested only in some regions, i.e., the regions the user tends to watch; these regions of interest represent the user's intent, while the other regions are irrelevant to it. The regions characterized by high saliency are those that most attract the user's interest and best represent the video content.
Step S130, dividing the original video into tiles to obtain multiple sub-videos.
In one embodiment, the server divides the panoramic video into tiles (TILE) to obtain sub-videos, one per tile. The server may divide the panoramic video into tiles of equal or unequal size, obtaining multiple sub-videos of equal or unequal size. For example, the server may assign slightly larger tiles along the directions of the panoramic video toward which users' viewing angles tend, and slightly smaller tiles to the other regions, set according to actual needs, so as to improve the coding and decoding effect and the playback quality of the corresponding sub-videos. It should be noted that the number of tiles may be set according to actual needs. In one embodiment, the server divides a panoramic video with a resolution of 3840x1920 into a 4x3 grid of tiles, obtaining 12 sub-videos in total, each with a resolution of 960x640. The server may adjust the number of sub-videos obtained by tiling according to bandwidth requirements: if the bandwidth of the 12 sub-videos in the above embodiment meets the design requirement, no further re-tiling is needed, and the server may set the number of tiles according to actual needs or by an artificial intelligence algorithm. In another embodiment, the more tiles there are, the more sub-videos are obtained and the more pronounced the achievable bandwidth reduction, but the required processing resources increase; the server may set the number of tiles according to bandwidth requirements so as to satisfy both bandwidth and processing requirements, and the present application places no specific restriction on this.
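The 4x3 tiling described above can be sketched as follows; this is a minimal sketch using only the frame size and grid from the embodiment, and the helper and field names are hypothetical:

```python
# Split a panoramic frame of size width x height into a cols x rows grid
# of equal tiles, recording each tile's number and pixel rectangle.

def split_into_tiles(width, height, cols, rows):
    tile_w, tile_h = width // cols, height // rows
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append({
                "tile": r * cols + c,              # tile number
                "x": c * tile_w, "y": r * tile_h,  # top-left corner in pixels
                "w": tile_w, "h": tile_h,
            })
    return tiles

tiles = split_into_tiles(3840, 1920, cols=4, rows=3)
# 12 sub-videos of 960x640 each, matching the embodiment above
```

Unequal tile sizes, as also contemplated above, would replace the uniform division with per-region rectangles while keeping the same tile records.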
It should be noted that, in order to reduce the overall transmission bandwidth, a viewport-dependent transmission mechanism is a necessary and feasible panoramic video transmission scheme: the content the user pays close attention to, or the content in the viewport the user is currently watching, is transmitted in high quality, while the quality of content in other regions that will not be seen for the time being may be appropriately reduced, or that content may not be transmitted at all. To realize panoramic video transmission that depends on the user's viewport, the embodiments of the present application adopt a tiled streaming strategy, which in the pixel domain splits the uncompressed original high-resolution video, frame by frame, into several low-resolution spatial tile videos, and encodes the sub-video of each tile into independently decodable media content. Such a scheme usually requires each tile to be encoded at different quality levels. In one embodiment, for the region within the user's viewing angle, the playback terminal downloads high-quality tile content of the corresponding region from the server side, and for regions outside the viewing angle, it downloads low-quality tile content of the corresponding regions.
It should be noted that a tile in the embodiments of the present application can be understood as follows: in High Efficiency Video Coding (HEVC), a picture can be divided into several tiles, i.e., the picture is split horizontally and vertically into several rectangular regions, and these rectangular regions are called tiles. Tiles may be as defined by MPEG HEVC coding, and the present application places no specific restriction on this.
The above steps S120 and S130 may be performed in either order; the video may also be tiled first and the saliency calculation performed afterwards.
Step S140, encoding and compressing the sub-videos according to the saliency distribution information.
In one embodiment, the server encodes and compresses the sub-videos according to the saliency distribution information; it may encode all of the obtained sub-videos, or only the sub-videos within some regions. It should be noted that the optimization algorithms in coding standards represented by HEVC are designed for traditional video and do not take into account some of the new characteristics of panoramic video; how to optimize the encoder for panoramic video so as to make full use of service resources has therefore become another problem in urgent need of a solution. In the embodiments of the present application, the sub-videos are encoded non-uniformly on the basis of the tiles, yielding sub-videos of different qualities, where the video quality obtained after encoding and compressing a sub-video of high saliency is higher than that of a sub-video of low saliency. The main purpose of the saliency calculation in the embodiments is to obtain a prediction of user behavior; based on the saliency distribution information obtained from the prediction, the different tiled sub-videos are encoded and compressed so that the sizes of the encoded sub-videos correspond to the characteristics of user behavior, thereby achieving low-bandwidth video transmission while reducing the consumption of computing and storage resources.
It can be understood that, in the panoramic video of one embodiment, the sub-videos in the regions users tend to fixate on have higher quality after encoding and compression, while the sub-videos at other positions have lower quality. The regions users tend to fixate on occupy a small proportion of the panoramic video, i.e., the number of high-quality sub-videos is smaller than the number of low-quality sub-videos. Therefore, compared with the original video, the compressed video obtained by the video processing method of the embodiments enables low-bandwidth video transmission while reducing the consumption of computing and storage resources.
Referring to FIG. 2, in an embodiment, the above step S120 may further include, but is not limited to, step S210 and step S220.
Step S210, acquiring the first frame or a key frame of the original video.
Step S220, performing saliency calculation on the original video according to the first frame or key frame, to obtain saliency distribution information of the first frame or key frame.
In one embodiment, the saliency distribution information obtained by the server is computed from the first frame or a key frame of the original video. The server acquires the first frame or a key frame of the original video according to preset requirements; the key frame may be the image frame of the original video that best characterizes saliency, selected by the user or developer according to actual needs. From the first frame or key frame, a screenshot of the original video at that frame is obtained; the server performs the saliency calculation on this image, and the result is applied to the other image frames of the panoramic video. It can be understood that, in the video processing method of the embodiments, the saliency distribution information obtained at the first frame or key frame can represent the predicted user behavior over the original video, and is applied to the image frames of the entire video for encoding and compression. The server may refresh the saliency distribution information periodically or aperiodically according to actual needs, re-acquiring a key frame to compute new saliency distribution information; the present application places no specific restriction on this.
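The embodiments deliberately leave the choice of saliency prediction algorithm open. Purely as an illustrative stand-in, and not the algorithm of this application, the sketch below scores each pixel of a grayscale key frame by its contrast against the frame's mean brightness and normalizes the result into a distribution; a real deployment would substitute a learned saliency model here:

```python
# Illustrative stand-in for the unspecified saliency prediction algorithm:
# score each pixel by its absolute contrast against the frame's mean
# brightness, then normalize so the whole map sums to 1.

def saliency_map(frame):
    """frame: 2D list of grayscale values; returns a 2D map summing to 1."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    raw = [[abs(p - mean) for p in row] for row in frame]
    total = sum(sum(row) for row in raw) or 1.0  # avoid division by zero
    return [[v / total for v in row] for row in raw]

frame = [[10, 10, 200],
         [10, 10, 200],
         [10, 10, 200]]
sal = saliency_map(frame)
# the high-contrast right column receives the highest saliency values
```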
Referring to FIG. 3, in an embodiment, the above step S140 may further include, but is not limited to, steps S310 to S330.
Step S310, obtaining the saliency weight value corresponding to each sub-video according to the saliency distribution information.
Step S320, obtaining the encoding parameters of each sub-video according to the saliency weight values.
Step S330, encoding and compressing the corresponding sub-video according to the encoding parameters.
In one embodiment, the server computes the corresponding saliency weight of the sub-video on each tile: it obtains the saliency value of each sub-video from the saliency distribution information, obtains the total saliency of the whole panoramic video from the same information, and takes the ratio of each sub-video's saliency value to the total as that sub-video's saliency weight value. The server assigns different encoding parameters according to the different saliency weight values of the sub-videos, so that each sub-video is encoded with its corresponding parameters. Encoding and compressing the sub-videos with different encoding parameters yields video files of different qualities; therefore, compared with the original video, the compressed video obtained by the video processing method of the embodiments enables low-bandwidth video transmission while reducing the consumption of computing and storage resources.
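The weight computation of step S310 can be sketched as follows, assuming a per-pixel saliency map and rectangular tiles; the function and field names are hypothetical:

```python
# For each tile, sum the per-pixel saliency inside its rectangle and
# divide by the frame's total saliency, giving weights that sum to 1.

def tile_weights(saliency, tiles):
    """saliency: 2D list [y][x] of per-pixel saliency values.
    tiles: list of dicts with pixel rectangles {"x", "y", "w", "h"}."""
    total = sum(sum(row) for row in saliency)
    weights = []
    for t in tiles:
        s = sum(saliency[y][x]
                for y in range(t["y"], t["y"] + t["h"])
                for x in range(t["x"], t["x"] + t["w"]))
        weights.append(s / total)
    return weights

# Toy 4x2 frame tiled into two 2x2 tiles; the right tile is more salient
saliency = [[1, 1, 3, 3],
            [1, 1, 3, 3]]
tiles = [{"x": 0, "y": 0, "w": 2, "h": 2},
         {"x": 2, "y": 0, "w": 2, "h": 2}]
w = tile_weights(saliency, tiles)
# w == [0.25, 0.75]
```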
Referring to FIG. 4, in an embodiment, the above step S320 may further include, but is not limited to, steps S410 to S440.
Step S410, acquiring the pre-allocated total quality parameter corresponding to the original video.
Step S420, acquiring the preset quality parameter corresponding to a sub-video.
Step S430, obtaining the pre-allocated quality parameter of the sub-video according to the pre-allocated total quality parameter and the saliency weight value.
Step S440, determining the encoding parameters of the sub-video according to the pre-allocated quality parameter and the preset quality parameter.
In one embodiment, the server sets different encoding parameters for the sub-videos according to different quality parameters. The server acquires the pre-allocated total quality parameter corresponding to the panoramic video; the pre-allocated total quality parameter is set in advance according to actual needs and is smaller than the original quality parameter of the original video. The server also acquires the preset quality parameter corresponding to each sub-video; the preset quality parameter may be the encoder's preset minimum or maximum value for the sub-video. The server then allocates to each sub-video its pre-allocated quality parameter according to the pre-allocated total quality parameter and the saliency weight values, and determines the encoding parameters of the sub-video according to the relationship between the pre-allocated quality parameter and the preset quality parameter. It can be understood that the quality parameter in the embodiments of the present application may be the video bitrate, or another video encoding parameter that affects image quality, such as the quantization parameter (QP), key frame interval (GOP), resolution, or frame rate. The server compares the magnitudes of the pre-allocated quality parameter and the preset quality parameter and takes the appropriate one as the encoding parameter.
It should be noted that the encoding parameters obtained by the server differ according to the quality parameter used. For example, when the quality parameter is the bitrate, the preset quality parameter is a preset bitrate; the preset bitrate is related to the encoder of the sub-video and has certain limits. When the preset quality parameter is greater than the pre-allocated quality parameter, the server may take either the pre-allocated quality parameter or the preset quality parameter as the encoding parameter for encoding and compressing the sub-video; when the preset quality parameter is smaller than the pre-allocated quality parameter, the server can no longer allocate a higher quality parameter to the sub-video, so it takes the preset quality parameter as the encoding parameter. As another example, when the quality parameter is the resolution, the preset quality parameter is a preset resolution, which is related to the size of the tiled sub-video. When the preset quality parameter is greater than the pre-allocated quality parameter, the server may take the pre-allocated quality parameter as the encoding parameter for encoding and compressing the sub-video; when the preset quality parameter is smaller than the pre-allocated quality parameter, the server may take either the pre-allocated quality parameter or the preset quality parameter as the encoding parameter. The above examples are merely illustrative and do not constitute a limitation on the embodiments of the present application.
Referring to FIG. 5, in an embodiment, the above step S320 may further include, but is not limited to, step S510 and step S520.
Step S510, obtaining a difference value from the pre-allocated quality parameter and the preset quality parameter.
Step S520, obtaining an updated pre-allocated total quality parameter from the difference value and the pre-allocated total quality parameter.
In one embodiment, after obtaining the pre-allocated quality parameter of each sub-video, the server computes the difference between the pre-allocated quality parameter and the preset quality parameter, and uses this difference together with the original pre-allocated total quality parameter to compute an updated pre-allocated total quality parameter. It should be noted that the pre-allocated total quality parameter may be a parameter set in advance, with a magnitude corresponding to the bandwidth; correcting it with the difference yields an updated pre-allocated total quality parameter that can be smaller, realizing low-bandwidth video transmission. It can be understood that, in one embodiment, if during the update the pre-allocated quality parameter of a certain sub-video is higher than its preset quality parameter, the subsequently updated pre-allocated total quality parameter may increase; however, for the panoramic video as a whole, the proportion of sub-videos whose pre-allocated quality parameter exceeds the preset quality parameter is not high, so overall the updated pre-allocated total quality parameter is lower than the original one. In another embodiment, the updated pre-allocated total quality parameter may be the same as the original pre-allocated total quality parameter, but it will be smaller than the original quality parameter of the original video; the embodiments of the present application place no specific restriction on this.
In one embodiment, taking the bitrate as the quality parameter, a bitrate selection algorithm is used to determine the specific encoding bitrate of each sub-video. The steps of the bitrate selection algorithm are as follows:
1) With the total bitrate limited to R_total, the bitrate that should be allocated to the i-th sub-video is R_i = R_total × w_i, where i denotes the i-th sub-video and w_i is the saliency weight value corresponding to the i-th sub-video.
2) Let the minimum bitrate corresponding to a sub-video be R_low; the minimum bitrate may be the above preset quality parameter. If the bitrate computed in 1) is smaller than the minimum bitrate, i.e., R_i < R_low, the bitrate value is updated to the minimum bitrate, i.e., R_i = R_low. R_total is then updated: with D_i being the difference between R_i and R_low, subtracting D_i from R_total gives the latest R_total.
3) Let the maximum bitrate corresponding to a sub-video be R_high; the maximum bitrate may be the above preset quality parameter. If the bitrate computed in 1) is greater than the maximum bitrate, i.e., R_i > R_high, the bitrate value is updated to the maximum bitrate, i.e., R_i = R_high. R_total is then updated: with D_i being the difference between R_i and R_high, adding D_i to R_total gives the latest R_total.
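Under one reading of these steps, in which each tile's proportional share is computed against the original limit and the clamping differences adjust a running budget, the algorithm can be sketched as follows; this is a sketch under that assumption, not a definitive implementation:

```python
# Allocate a total bitrate budget across tiles in proportion to their
# saliency weights, clamping each tile to [r_low, r_high] and folding the
# per-tile difference D_i back into the running total, as in steps 1)-3).

def select_bitrates(r_total, weights, r_low, r_high):
    rates = []
    budget = r_total
    for w in weights:
        r = r_total * w              # step 1): proportional share of the limit
        if r < r_low:                # step 2): raise to the minimum bitrate
            budget -= r_low - r      # extra spend shrinks the remaining budget
            r = r_low
        elif r > r_high:             # step 3): cap at the maximum bitrate
            budget += r - r_high     # freed amount enlarges the remaining budget
            r = r_high
        rates.append(r)
    return rates, budget

# 12 tiles with one dominant region: the dominant tile is capped at r_high
# and the remaining tiles are floored at r_low.
weights = [0.67] + [0.03] * 11
rates, budget = select_bitrates(12_000, weights, r_low=500, r_high=4_000)
```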
On the premise of satisfying the requirements of the embodiments of the present application, steps 2) and 3) above may also update the pre-allocated total quality parameter without using the difference. For example, when the preset quality parameter is smaller or greater than the pre-allocated quality parameter, the server may take the preset quality parameter as the encoding parameter of the corresponding sub-video; for the encoding of the remaining sub-videos, the pre-allocated total quality parameter minus the values of the sub-videos whose encoding parameters have already been allocated, together with the saliency weight values of the remaining sub-videos, yields the pre-allocated quality parameters of the remaining sub-videos, from which their encoding parameters can be obtained. It can also be understood that, for example, when the preset quality parameter is greater than the pre-allocated quality parameter, the server may take the pre-allocated quality parameter as the encoding parameter of the corresponding sub-video and update the pre-allocated total quality parameter with the difference computed above; for the encoding of the remaining sub-videos, the updated pre-allocated total quality parameter minus the values of the sub-videos whose encoding parameters have already been allocated, together with the saliency weight values of the remaining sub-videos, yields the pre-allocated quality parameters of the remaining sub-videos, from which their encoding parameters can be obtained. The above examples are merely illustrative and do not constitute a limitation on the embodiments of the present application.
Referring to FIG. 6, the video processing method in the embodiments of the present application may further include, but is not limited to, step S610 and step S620.
Step S610, acquiring the video encapsulation protocol.
Step S620, performing streaming media transmission encapsulation on the encoded and compressed sub-videos according to the encapsulation protocol.
In one embodiment, the server needs to perform streaming media transmission encapsulation on the encoded video. The server may obtain the video encapsulation protocol from the device that will play the video, and encapsulate the encoded and compressed sub-videos for streaming transmission according to the respective encapsulation protocol. For example, some encapsulation protocols require periodic encapsulation, in which case the server completes one encapsulation of the encoded and compressed sub-videos within each period. Encapsulation protocols include, but are not limited to, HLS, DASH, and MSS; the embodiments of the present application place no specific restriction on this.
In one embodiment, the server generates a playback index file for the sub-videos according to the encoded and compressed sub-videos, adds information fields for the tiled sub-videos to the playback index file, and marks the tile number and position information corresponding to each video file for identification by the playback terminal.
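One possible shape for such a playback index is a small JSON manifest; this is an illustrative format only (a real deployment would extend an HLS or DASH manifest), and the field names and URL template are hypothetical:

```python
import json

# Build a playback index that records, for each encoded tile file, its
# tile number and pixel position, so the playback terminal can map
# downloaded files back onto the panorama.

def build_index(tiles, url_template="tile_{tile}.m4s"):
    entries = [{
        "tile": t["tile"],                             # tile number
        "position": [t["x"], t["y"], t["w"], t["h"]],  # pixel rectangle
        "url": url_template.format(tile=t["tile"]),
    } for t in tiles]
    return json.dumps({"tiles": entries}, indent=2)

tiles = [{"tile": 0, "x": 0, "y": 0, "w": 960, "h": 640},
         {"tile": 1, "x": 960, "y": 0, "w": 960, "h": 640}]
index = build_index(tiles)
```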
The embodiments of the present application further provide a video processing method applied to a playback terminal. Referring to FIG. 7, the video processing method in the embodiments includes, but is not limited to, step S710, step S720, and step S730.
Step S710, sending a play request for a film source to the server, so that the server determines the film source according to the play request.
Step S720, receiving the encoded and compressed sub-videos corresponding to the film source sent by the server, where the encoded and compressed sub-videos are obtained by the server performing saliency calculation on the original video of the film source to obtain saliency distribution information of the original video, dividing the original video into tiles to obtain multiple sub-videos, and then encoding and compressing the sub-videos according to the saliency distribution information.
Step S730, decoding the encoded and compressed sub-videos to obtain the playback video.
In one embodiment, the video processing method mentioned here is applied on a playback terminal, and the encoded and compressed sub-videos are obtained by the video processing method executed by the server in the above embodiments, which will not be repeated here. The playback terminal sends a play request for a film source to the server; after the server determines the corresponding film source according to the play request, the playback terminal can receive the encoded and compressed sub-videos corresponding to the film source sent by the server and decode them to obtain the playback video. The embodiments of the present application take a 360-degree panoramic video as an example of the original video, so the playback terminal in the embodiments is a VR playback terminal, but this does not constitute a limitation on the embodiments; in the following embodiments the original video may be referred to as a panoramic video or a 360-degree panoramic video. The main purpose of the saliency calculation in the embodiments is to obtain a prediction of user behavior; based on the saliency distribution information obtained from the prediction, the different tiled sub-videos are encoded and compressed so that the sizes of the encoded sub-videos correspond to the characteristics of user behavior. The encoded and compressed sub-videos obtained by the playback terminal thus have different qualities according to the characteristics of user behavior, thereby achieving low-bandwidth video transmission while reducing the consumption of computing and storage resources.
参照图8所示,在一实施例中,播放请求包括用户在播放终端视角上的选定区域内,上述步骤S720中还可以包括但不限于步骤S810和步骤S820。Referring to FIG. 8 , in one embodiment, the playback request includes the user's selected area on the viewing angle of the playback terminal, and the above step S720 may also include but not limited to step S810 and step S820.
Step S810: determine, according to the region corresponding to the selected region, the sub-video corresponding to that region of the tiled original video.
Step S820: receive, from the server, the encoded and compressed sub-video corresponding to the selected region.
In one embodiment, the playback terminal may download all sub-videos of the panoramic video from the server, or it may obtain only the sub-videos for the region selected by the user within the viewing angle of the playback terminal. The playback terminal determines, according to the region corresponding to the selected region, the sub-video corresponding to that region of the tiled original video. The selected region may be a region chosen by the user or the region covered by the user's current viewing angle. After the playback terminal determines the position of the region, it receives from the server the encoded and compressed sub-video corresponding to the selected region, which enables low-bandwidth video transmission while reducing the consumption of computing and storage resources.
It can be understood that when the playback device is not a VR playback device but another type of playback device such as a mobile phone, the user can select a viewing region on the playback device and use that region as the selected region described in the above embodiments, so that the playback device can obtain and decode the sub-videos of that region according to the user's selection, which enables low-bandwidth video transmission while reducing the consumption of computing and storage resources.
下面,通过具体的实施例进行说明。In the following, description will be made through specific examples.
实施例一,本申请实施例提供了一种应用于点播业务的全景视频文件处理方法,具体步骤如下:Embodiment 1, the embodiment of the present application provides a panoramic video file processing method applied to the on-demand service, and the specific steps are as follows:
1.1) The complete panoramic video source for on-demand playback is injected into the video server. The encoding format and file container format of the injected source are not restricted: the video encoding format may be AVC/H.264, HEVC/H.265, etc., and the file container format may be MP4, MPEG-TS, etc.
1.2) A saliency prediction algorithm is used to compute the saliency of the 360-degree panoramic video, producing a two-dimensional saliency distribution matrix with the same width and height (resolution) as the panoramic video. A 4K panoramic video with a resolution of 3840x1920 is used here as an illustration: the saliency prediction result is a 3840x1920 two-dimensional matrix that records the saliency value of every pixel. The main purpose of the saliency calculation is to obtain a prediction of user behavior; no particular algorithm is required, although different saliency prediction algorithms give different prediction accuracies and may therefore affect the final optimization result.
1.3) The 3840x1920 panoramic video is divided into a 4x3 grid of tiles, 12 tiles in total, each with a resolution of 960x640. In practice, the tile sizes may be the same or different.
以上步骤1.2)和步骤1.3)顺序不分先后,也可以先分块再做显著度计算。The above steps 1.2) and 1.3) are in no particular order, and the saliency calculation can also be done in blocks first.
1.4) The weight of each tile is computed as the sum of the saliency values covered by the corresponding 960x640 rectangle divided by the sum of all saliency values over the full 3840x1920 frame.
Steps 1.2), 1.3) and 1.4) above may be computed over the entire video, or only over the first frame or some key frames, with the results applied to the remaining image frames of the video.
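Steps 1.2) to 1.4) above can be sketched as follows. This is an illustrative sketch only: the function name `tile_weights` and the nested-list representation of the saliency matrix are assumptions, and the specification places no restriction on the saliency prediction algorithm that produces the matrix.

```python
def tile_weights(saliency, tiles_x=4, tiles_y=3):
    """Split a saliency matrix into a tiles_y x tiles_x grid and return, for
    each tile (row-major order), the sum of the saliency values it covers
    divided by the sum of all saliency values, as described in step 1.4)."""
    h, w = len(saliency), len(saliency[0])      # e.g. 1920 x 3840 for 4K
    th, tw = h // tiles_y, w // tiles_x         # e.g. 640 x 960 per tile
    total = sum(sum(row) for row in saliency)
    weights = []
    for r in range(tiles_y):
        for c in range(tiles_x):
            s = sum(sum(saliency[y][c * tw:(c + 1) * tw])
                    for y in range(r * th, (r + 1) * th))
            weights.append(s / total)
    return weights
```

For a uniform saliency map the 12 weights are all equal to 1/12; a map concentrated in one region shifts weight, and hence bit rate, toward the tiles covering that region.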
1.5) The specific values of the encoding parameters are determined. The encoding parameter here may be the video bit rate, or any other video encoding parameter that affects image quality, such as the quantization parameter, key frame interval, resolution, or frame rate.
以码率为例,采用码率选择算法确定每个分块的具体编码码率,码率选择算法步骤如下:Taking the bit rate as an example, the bit rate selection algorithm is used to determine the specific encoding bit rate of each block. The steps of the bit rate selection algorithm are as follows:
1.5.1) Let the total bit rate budget be R_total; the bit rate allocated to the i-th sub-video is then R_i = R_total × w_i, where i indexes the sub-videos and w_i is the saliency weight of the i-th sub-video.
1.5.2) Let R_low be the minimum bit rate for a sub-video; the minimum bit rate may be the preset quality parameter described above. If the bit rate computed in 1.5.1) is below the minimum, i.e. R_i < R_low, the bit rate is raised to the minimum, i.e. R_i = R_low. R_total is then updated: with D_i the difference between R_i and R_low, subtracting D_i from R_total yields the updated R_total.
1.5.3) Let R_high be the maximum bit rate for a sub-video; the maximum bit rate may be the preset quality parameter described above. If the bit rate computed in 1.5.1) exceeds the maximum, i.e. R_i > R_high, the bit rate is capped at the maximum, i.e. R_i = R_high. R_total is then updated: with D_i the difference between R_i and R_high, adding D_i to R_total yields the updated R_total.
步骤1.5.2)和步骤1.5.3)顺序可调换。The order of step 1.5.2) and step 1.5.3) can be changed.
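The bit rate selection of steps 1.5.1) to 1.5.3) can be sketched as follows, under one reading of the algorithm in which the budget R_total is updated as each tile is clamped. The function name and the sequential per-tile update order are assumptions; the specification itself allows steps 1.5.2) and 1.5.3) to be swapped.

```python
def allocate_bitrates(weights, r_total, r_low, r_high):
    """Distribute the total bit rate budget across tiles in proportion to
    their saliency weights, clamping each tile's rate to [r_low, r_high]
    and returning the clamped difference D_i to (or taking it from) the
    remaining budget, per steps 1.5.1)-1.5.3)."""
    rates = []
    for w in weights:
        r = r_total * w                 # step 1.5.1): proportional share
        if r < r_low:                   # step 1.5.2): raise to the floor
            r_total -= r_low - r        # budget shrinks by D_i
            r = r_low
        elif r > r_high:                # step 1.5.3): cap at the ceiling
            r_total += r - r_high       # budget grows by D_i
            r = r_high
        rates.append(r)
    return rates
```

With 12 equal weights and a 12000 kbps budget, every tile receives 1000 kbps and no clamping occurs; with a strongly skewed weight, the dominant tile is capped at r_high and the freed bit rate returns to the budget for the remaining tiles.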
1.6) The original panoramic video is decoded and then re-encoded as HEVC tiles based on MCTS (motion-constrained tile sets). During encoding, the compression bit rate of each tile is constrained by the bit rate obtained in step 1.5). In this example, encoding yields 12 compressed videos that can each be decoded completely independently.
1.7) The encoded videos are packaged for streaming and a playback index file is generated. The packaging protocol includes, but is not limited to, HLS, DASH, MSS, etc. A tile-information field is added to the playback index file to record the tile number and position corresponding to each video file, for identification by the playback terminal.
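The tile-information fields of step 1.7) could look like the following sketch. The JSON layout and field names (`tile_id`, `x`, `y`, etc.) are hypothetical — the specification only requires that the index record each file's tile number and position — and a real deployment would carry this information inside an HLS/DASH/MSS manifest rather than a standalone JSON document.

```python
import json

def build_tile_index(tile_files, tiles_x=4, tile_w=960, tile_h=640):
    """Attach a tile number and pixel position in the panorama to each
    encoded tile file, row-major from the top-left tile."""
    entries = []
    for i, path in enumerate(tile_files):
        row, col = divmod(i, tiles_x)
        entries.append({
            "file": path,
            "tile_id": i,               # tile number for terminal lookup
            "x": col * tile_w,          # left edge within the 3840x1920 frame
            "y": row * tile_h,          # top edge
            "width": tile_w,
            "height": tile_h,
        })
    return json.dumps({"tiles": entries}, indent=2)
```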
实施例二,本申请实施例提供了一种全景视频点播业务中VR播放终端和视频服务器的交互流程,具体步骤如下:Embodiment 2, the embodiment of the present application provides an interaction process between a VR player terminal and a video server in a panoramic video on demand service, and the specific steps are as follows:
2.1)VR播放终端向视频服务器发起播放索引文件index.mpd请求,index.mpd描述了每个视频文件的名称、存放路径和对应的分块编号、位置等信息。2.1) The VR playback terminal initiates a request to the video server to play the index file index.mpd, and index.mpd describes the name, storage path, corresponding block number, location and other information of each video file.
2.2)视频服务器返回播放索引文件index.mpd。2.2) The video server returns the playback index file index.mpd.
2.3)VR播放终端解析index.mpd,获取视频文件信息,向视频服务器发起视频文件下载请求,可以下载所有分块对应的视频文件,也可以只下载视角范围内的分块对应的文件。2.3) The VR playback terminal parses index.mpd, obtains video file information, and initiates a video file download request to the video server. It can download video files corresponding to all segments, or only download files corresponding to segments within the viewing angle range.
2.4)视频服务器发送视频文件给VR播放终端。2.4) The video server sends the video file to the VR playback terminal.
2.5)VR播放终端对视频文件进行空间重组、解码并播放。2.5) The VR player terminal reorganizes, decodes and plays the video file.
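The choice in step 2.3) between downloading all tiles and only those within the viewing angle can be sketched as follows, in longitude only. The function name and the assumption of equal-width tile columns are illustrative; a real terminal would intersect the full viewport with the tile positions parsed from index.mpd, including the vertical extent.

```python
def tile_columns_in_view(yaw_deg, fov_deg, tiles_x=4, pano_deg=360.0):
    """Return the column indices of tiles overlapping a horizontal viewport
    centered at yaw_deg and fov_deg wide, handling wrap-around at 360 deg."""
    tile_deg = pano_deg / tiles_x
    left = (yaw_deg - fov_deg / 2) % pano_deg
    right = (yaw_deg + fov_deg / 2) % pano_deg
    cols = set()
    for c in range(tiles_x):
        t0, t1 = c * tile_deg, (c + 1) * tile_deg
        if left <= right:
            overlaps = t1 > left and t0 < right
        else:                            # viewport crosses the 0/360 seam
            overlaps = t1 > left or t0 < right
        if overlaps:
            cols.add(c)
    return sorted(cols)
```

A 90-degree viewport centered on the seam (yaw 0) touches the first and last tile columns, so only those tile files need to be requested.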
实施例三,本申请实施例提供了一种应用于直播业务的全景视频文件处理方法,具体步骤如下:Embodiment 3, the embodiment of the present application provides a panoramic video file processing method applied to the live broadcast service, and the specific steps are as follows:
3.1) The video server pulls or receives a 360-degree panoramic live stream. The encoding format of the live stream and the stream transport/packaging protocol are not restricted: the video encoding protocol may be AVC/H.264, HEVC/H.265, etc., and the stream packaging protocol may be RTSP, RTMP, HLS, etc.
3.2) A saliency prediction algorithm is used to compute the saliency of the panoramic live stream, producing a two-dimensional saliency distribution matrix with the same width and height (resolution) as the panoramic video. A 4K panoramic video with a resolution of 3840x1920 is used here as an illustration: the saliency prediction result is a 3840x1920 two-dimensional matrix that records the saliency value of every pixel.
3.3) The 3840x1920 panoramic video is divided into a 4x3 grid of tiles, 12 tiles in total, each with a resolution of 960x640. In practice, the tile sizes may be the same or different.
以上步骤3.2)和步骤3.3)顺序不分先后,也可以先分块再做显著度计算。The above steps 3.2) and 3.3) are in no particular order, and the saliency calculation can also be done in blocks first.
3.4) The weight of each tile is computed as the sum of the saliency values covered by the corresponding 960x640 rectangle divided by the sum of all saliency values over the full 3840x1920 frame.
Steps 3.2), 3.3) and 3.4) above may be computed over one key frame or some key frames of the live stream, with the results applied to the remaining image frames of the stream.
3.5) The specific values of the encoding parameters are determined. The encoding parameter here may be the video bit rate, or any other video encoding parameter that affects image quality, such as the quantization parameter, key frame interval, resolution, or frame rate.
以码率为例,采用码率选择算法确定每个分块的具体编码码率,码率选择算法步骤如下:Taking the bit rate as an example, the bit rate selection algorithm is used to determine the specific encoding bit rate of each block. The steps of the bit rate selection algorithm are as follows:
3.5.1) Let the total bit rate budget be R_total; the bit rate allocated to the i-th sub-video is then R_i = R_total × w_i, where i indexes the sub-videos and w_i is the saliency weight of the i-th sub-video.
3.5.2) Let R_low be the minimum bit rate for a sub-video; the minimum bit rate may be the preset quality parameter described above. If the bit rate computed in 3.5.1) is below the minimum, i.e. R_i < R_low, the bit rate is raised to the minimum, i.e. R_i = R_low. R_total is then updated: with D_i the difference between R_i and R_low, subtracting D_i from R_total yields the updated R_total.
3.5.3) Let R_high be the maximum bit rate for a sub-video; the maximum bit rate may be the preset quality parameter described above. If the bit rate computed in 3.5.1) exceeds the maximum, i.e. R_i > R_high, the bit rate is capped at the maximum, i.e. R_i = R_high. R_total is then updated: with D_i the difference between R_i and R_high, adding D_i to R_total yields the updated R_total.
步骤3.5.2)和步骤3.5.3)顺序可调换。The order of step 3.5.2) and step 3.5.3) can be changed.
3.6) At each period, the panoramic live stream within that period is decoded and then re-encoded as HEVC tiles based on MCTS. During encoding, the compression bit rate of each tile is constrained by the bit rate obtained in step 3.5). In this example, encoding yields 12 compressed videos that can each be decoded completely independently.
3.7) The videos encoded within the above period are sliced and packaged for streaming, and a playback index file is generated. The packaging protocol includes, but is not limited to, HLS, DASH, MSS, etc. A tile-information field is added to the playback index file to record the tile number and position corresponding to each video file, for identification by the playback terminal.
实施例四,本申请实施例提供了一种全景视频直播业务中VR播放终端和视频服务器的交互流程,具体步骤如下:Embodiment 4. The embodiment of the present application provides an interaction process between a VR playback terminal and a video server in a panoramic video live broadcast service. The specific steps are as follows:
4.1)VR播放终端周期性地向视频服务器发起播放索引文件index.mpd请求,index.mpd描述了最近几个周期的视频文件的名称、存放路径和对应的分块编号、位置等信息。4.1) The VR playback terminal periodically initiates a request to the video server to play the index file index.mpd. index.mpd describes the name, storage path, corresponding block number, location and other information of the video files in the last few cycles.
4.2)视频服务器返回最新的播放索引文件index.mpd。4.2) The video server returns the latest playback index file index.mpd.
4.3)VR播放终端解析index.mpd,获取视频文件信息,向视频服务器发起视频文件下载请求,可以下载所有分块对应的视频文件,也可以只下载视角范围内的分块对应的文件。4.3) The VR playback terminal parses index.mpd, obtains video file information, and initiates a video file download request to the video server. It can download video files corresponding to all segments, or only download files corresponding to segments within the viewing angle range.
4.4)视频服务器发送视频文件给VR播放终端。4.4) The video server sends the video file to the VR playback terminal.
4.5)VR播放终端对视频文件进行空间重组、解码并播放。4.5) The VR player terminal reorganizes, decodes and plays the video file.
上述实施例中的服务器具备视频编解码和存储能力,播放终端具备全景视频解码和播放能力。The server in the above embodiments has video encoding and decoding and storage capabilities, and the playback terminal has panoramic video decoding and playback capabilities.
图9示出了本申请实施例提供的电子设备100。电子设备100包括:处理器101、存储器102及存储在存储器102上并可在处理器101上运行的计算机程序,计算机程序运行时用于执行上述的视频处理方法。FIG. 9 shows an electronic device 100 provided by an embodiment of the present application. The electronic device 100 includes: a processor 101 , a memory 102 , and a computer program stored on the memory 102 and operable on the processor 101 , and the computer program is used to execute the above video processing method when running.
处理器101和存储器102可以通过总线或者其他方式连接。The processor 101 and the memory 102 may be connected through a bus or in other ways.
存储器102作为一种非暂态计算机可读存储介质,可用于存储非暂态软件程序以及非暂态性计算机可执行程序,如本申请实施例描述的视频处理方法。处理器101通过运行存储在存储器102中的非暂态软件程序以及指令,从而实现上述的视频处理方法。The memory 102, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer executable programs, such as the video processing method described in the embodiment of the present application. The processor 101 implements the above video processing method by running the non-transitory software programs and instructions stored in the memory 102 .
The memory 102 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data required for executing the above video processing method. In addition, the memory 102 may include a high-speed random access memory and may also include a non-transitory memory, such as at least one storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory 102 may optionally include memories disposed remotely from the processor 101, and these remote memories may be connected to the electronic device 100 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the above video processing method are stored in the memory 102 and, when executed by one or more processors 101, perform the above video processing method, for example, method steps S110 to S140 in FIG. 1, method steps S210 to S220 in FIG. 2, method steps S310 to S330 in FIG. 3, method steps S410 to S440 in FIG. 4, method steps S510 to S520 in FIG. 5, method steps S610 to S620 in FIG. 6, method steps S710 to S730 in FIG. 7, and method steps S810 to S820 in FIG. 8.
本申请实施例还提供了计算机可读存储介质,存储有计算机可执行指令,计算机可执行指令用于执行上述的视频处理方法。The embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions, and the computer-executable instructions are used to execute the above-mentioned video processing method.
In one embodiment, the computer-readable storage medium stores computer-executable instructions which are executed by one or more control processors, for example, to perform method steps S110 to S140 in FIG. 1, method steps S210 to S220 in FIG. 2, method steps S310 to S330 in FIG. 3, method steps S410 to S440 in FIG. 4, method steps S510 to S520 in FIG. 5, method steps S610 to S620 in FIG. 6, method steps S710 to S730 in FIG. 7, and method steps S810 to S820 in FIG. 8.
The embodiments of the present application provide at least the following beneficial effects. The video processing method in the embodiments of the present application can be applied on a server or a playback terminal. The server first obtains the original video of a film source and performs a saliency calculation on it to obtain saliency distribution information over the original video. The server then tiles the original video into multiple sub-videos and encodes and compresses each sub-video according to the saliency distribution information. After the playback terminal sends the server a playback request for the film source, it receives the encoded and compressed sub-videos corresponding to that source and decodes them to obtain the playback video. The main purpose of the saliency calculation is to obtain a prediction of user behavior: based on the saliency distribution information obtained from these predictions, the tiled sub-videos are encoded and compressed so that the size of each encoded sub-video matches the characteristics of user behavior, which enables low-bandwidth video transmission while reducing the consumption of computing and storage resources.
以上所描述的装置实施例仅仅是示意性的,其中作为分离部件说明的单元可以是或者也可以不是物理上分开的,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
还应了解,本申请实施例提供的各种实施方式可以任意进行组合,以实现不同的技术效果。It should also be understood that the various implementation manners provided in the embodiments of the present application may be combined arbitrarily to achieve different technical effects.
The above is a specific description of some implementations of the present application, but the present application is not limited to the above embodiments. Those skilled in the art may make various equivalent variations or replacements without departing from the spirit of the present application, and such equivalent variations or replacements fall within the scope defined by the claims of the present application.

Claims (10)

  1. 一种视频处理方法,应用于服务器,所述视频处理方法包括:A video processing method applied to a server, the video processing method comprising:
    获取片源的原始视频;Get the original video of the film source;
    对所述原始视频进行显著度计算得到所述原始视频上的显著度分布信息;performing saliency calculation on the original video to obtain saliency distribution information on the original video;
    对所述原始视频进行分块得到多个子视频;Blocking the original video to obtain a plurality of sub-videos;
    根据所述显著度分布信息对所述子视频进行编码压缩。Encoding and compressing the sub-video is performed according to the saliency distribution information.
  2. 根据权利要求1所述的视频处理方法,其中,所述对所述原始视频进行显著度计算得到所述原始视频上的显著度分布信息,包括:The video processing method according to claim 1, wherein said performing saliency calculation on said original video to obtain saliency distribution information on said original video comprises:
    获取所述原始视频的首帧或关键帧;Obtain the first frame or key frame of the original video;
    根据所述首帧或所述关键帧对所述原始视频进行显著度计算,得到所述首帧或所述关键帧上的所述显著度分布信息。Performing saliency calculation on the original video according to the first frame or the key frame to obtain the saliency distribution information on the first frame or the key frame.
  3. 根据权利要求1或2所述的视频处理方法,其中,所述根据所述显著度分布信息对所述子视频进行编码压缩,包括:The video processing method according to claim 1 or 2, wherein said encoding and compressing said sub-video according to said saliency distribution information comprises:
    根据所述显著度分布信息得到每个所述子视频所对应的显著度权重值;obtaining a saliency weight value corresponding to each of the sub-videos according to the saliency distribution information;
    根据所述显著度权重值得到每个所述子视频的编码参数;Obtain encoding parameters of each sub-video according to the saliency weight value;
    根据所述编码参数对相应的所述子视频进行编码压缩。Encoding and compressing the corresponding sub-videos according to the encoding parameters.
  4. 根据权利要求3所述的视频处理方法,其中,所述根据所述显著度权重值得到所述子视频的编码参数,包括:The video processing method according to claim 3, wherein said obtaining the coding parameters of said sub-video according to said saliency weight value comprises:
    获取所述原始视频对应的预分配总质量参数;Obtain the pre-allocated total quality parameter corresponding to the original video;
    获取所述子视频对应的预设质量参数;Acquiring preset quality parameters corresponding to the sub-video;
    根据所述预分配总质量参数和所述显著度权重值得到所述子视频的预分配质量参数;Obtaining a pre-allocated quality parameter of the sub-video according to the pre-allocated total quality parameter and the saliency weight value;
    根据所述预分配质量参数和所述预设质量参数确定得到所述子视频的所述编码参数。The encoding parameter of the sub-video is obtained by determining according to the pre-allocated quality parameter and the preset quality parameter.
  5. 根据权利要求4所述的视频处理方法,其中,所述根据所述显著度权重值得到所述子视频的编码参数,还包括:The video processing method according to claim 4, wherein said obtaining the encoding parameters of said sub-video according to said saliency weight value further comprises:
    根据所述预分配质量参数和所述预设质量参数得到差值;obtaining a difference according to the pre-allocated quality parameter and the preset quality parameter;
    根据所述差值和所述预分配总质量参数得到更新后的所述预分配总质量参数。The updated pre-allocated total quality parameter is obtained according to the difference value and the pre-allocated total quality parameter.
  6. 根据权利要求1所述的视频处理方法,其中,所述视频处理方法还包括:The video processing method according to claim 1, wherein the video processing method further comprises:
    获取视频封装协议;Obtain the video encapsulation protocol;
    根据所述封装协议对编码压缩后的所述子视频进行流媒体传输封装。Perform streaming media transmission encapsulation on the encoded and compressed sub-video according to the encapsulation protocol.
  7. 一种视频处理方法,应用于播放终端,所述视频处理方法包括:A video processing method applied to a playback terminal, the video processing method comprising:
    向服务器发送片源的播放请求,以使所述服务器根据所述播放请求确定所述片源;Sending a play request of the film source to the server, so that the server determines the film source according to the play request;
    receiving the encoded and compressed sub-videos corresponding to the film source sent by the server, wherein the encoded and compressed sub-videos are obtained by the server performing a saliency calculation on the original video of the film source to obtain saliency distribution information over the original video, dividing the original video into blocks to obtain a plurality of sub-videos, and encoding and compressing the sub-videos according to the saliency distribution information;
    decoding the encoded and compressed sub-videos to obtain a playback video.
  8. The video processing method according to claim 7, wherein the playback request includes a region selected by the user within the viewing angle of the playback terminal, and the receiving the encoded and compressed sub-videos corresponding to the film source sent by the server comprises:
    根据所述选定区域所对应的区域,确定所述原始视频分块的区域对应的所述子视频;determining the sub-video corresponding to the area of the original video block according to the area corresponding to the selected area;
    接收所述服务器发送的与所述选定区域对应的所述编码压缩后的子视频。receiving the coded and compressed sub-video corresponding to the selected area sent by the server.
  9. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the video processing method according to any one of claims 1 to 6 or the video processing method according to any one of claims 7 to 8.
  10. A computer-readable storage medium, wherein the storage medium stores a program which, when executed by a processor, implements the video processing method according to any one of claims 1 to 6 or the video processing method according to any one of claims 7 to 8.
PCT/CN2022/114283 2021-10-25 2022-08-23 Video processing method, electronic device and storage medium WO2023071469A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111239873.4A CN116033180A (en) 2021-10-25 2021-10-25 Video processing method, electronic device and storage medium
CN202111239873.4 2021-10-25

Publications (1)

Publication Number Publication Date
WO2023071469A1 true WO2023071469A1 (en) 2023-05-04

Family

ID=86072842

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/114283 WO2023071469A1 (en) 2021-10-25 2022-08-23 Video processing method, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN116033180A (en)
WO (1) WO2023071469A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117579843B (en) * 2024-01-17 2024-04-02 淘宝(中国)软件有限公司 Video coding processing method and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140153651A1 (en) * 2011-07-19 2014-06-05 Thomson Licensing Method and apparatus for reframing and encoding a video signal
US20140269901A1 (en) * 2013-03-13 2014-09-18 Magnum Semiconductor, Inc. Method and apparatus for perceptual macroblock quantization parameter decision to improve subjective visual quality of a video signal
WO2020091872A1 (en) * 2018-10-29 2020-05-07 University Of Washington Saliency-based video compression systems and methods
CN112055263A (en) * 2020-09-08 2020-12-08 Xi'an Jiaotong University 360-degree video streaming transmission system based on saliency detection
CN112637596A (en) * 2020-12-21 2021-04-09 National Space Science Center, Chinese Academy of Sciences Bit rate control system
CN113411582A (en) * 2021-05-10 2021-09-17 South China University of Technology Video coding method, system, device and medium based on active contour

Also Published As

Publication number Publication date
CN116033180A (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110036641B (en) Method, device and computer readable storage medium for processing video data
EP3510744B1 (en) Methods and apparatus to reduce latency for 360-degree viewport adaptive streaming
TWI712313B (en) Systems and methods of signaling of regions of interest
TWI712309B (en) Enhanced signaling of regions of interest in container files and video bitstreams
US11438600B2 (en) Immersive media metrics for virtual reality content with multiple viewpoints
CN107634930B (en) Method and device for acquiring media data
CN110035331B (en) Media information processing method and device
CN113557741B (en) Method and apparatus for adaptive streaming of point clouds
CN109963176B (en) Video code stream processing method and device, network equipment and readable storage medium
US20200228837A1 (en) Media information processing method and apparatus
CN103607667A (en) A slicing method for SVC video files in a P2P streaming media system
US9258622B2 (en) Method of accessing a spatio-temporal part of a video sequence of images
CN115398481A (en) Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding on image
WO2023071469A1 (en) Video processing method, electronic device and storage medium
US20110228166A1 (en) method and device for determining the value of a delay to be applied between sending a first dataset and sending a second dataset
US20240080487A1 (en) Method, apparatus for processing media data, computer device and storage medium
US20140321556A1 (en) Reducing amount of data in video encoding
US20230038928A1 (en) Picture partitioning-based coding method and device
US20230300346A1 (en) Supporting view direction based random access of bitsteam
CN112470481A (en) Encoder and method for encoding tile-based immersive video
CN114424552A (en) Low-delay source-channel joint coding method and related equipment
US11736730B2 (en) Systems, methods, and apparatuses for video processing
US20230360277A1 (en) Data processing method and apparatus for immersive media, device and storage medium
RU2795052C2 (en) Methods and device for adaptive point cloud streaming
US20240040169A1 (en) Media file processing method and device therefor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885368

Country of ref document: EP

Kind code of ref document: A1