WO2019024521A1 - Image processing method, terminal and server - Google Patents

Image processing method, terminal and server

Info

Publication number
WO2019024521A1
Authority
WO
WIPO (PCT)
Prior art keywords
sub
image
area
latitude
division
Application number
PCT/CN2018/081177
Other languages
English (en)
French (fr)
Inventor
宋翼
谢清鹏
邸佩云
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to BR112020002235-7A (BR112020002235A2)
Priority to CA3069034A (CA3069034C)
Priority to RU2020108306A (RU2764462C2)
Priority to EP18841334.8A (EP3633993A4)
Priority to SG11201913824XA
Priority to AU2018311589A (AU2018311589B2)
Priority to JP2019571623A (JP6984841B2)
Priority to KR1020207001630A (KR102357137B1)
Publication of WO2019024521A1
Priority to PH12020500047A (PH12020500047A1)
Priority to US16/739,568 (US11032571B2)

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 3/00: Geometric image transformations in the plane of the image
                    • G06T 3/06: Topological mapping of higher dimensional structures onto lower dimensional surfaces
                        • G06T 3/067: Reshaping or unfolding 3D tree structures onto 2D planes
    • H: ELECTRICITY
        • H04: ELECTRIC COMMUNICATION TECHNIQUE
            • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
                    • H04N 19/10: using adaptive coding
                        • H04N 19/102: characterised by the element, parameter or selection affected or controlled by the adaptive coding
                            • H04N 19/115: Selection of the code volume for a coding unit prior to coding
                            • H04N 19/117: Filters, e.g. for pre-processing or post-processing
                            • H04N 19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
                            • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
                        • H04N 19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
                            • H04N 19/146: Data rate or code amount at the encoder output
                            • H04N 19/162: User input
                            • H04N 19/167: Position within a video image, e.g. region of interest [ROI]
                        • H04N 19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                            • H04N 19/17: the unit being an image region, e.g. an object
                                • H04N 19/174: the region being a slice, e.g. a line of blocks or a group of blocks
                                • H04N 19/176: the region being a block, e.g. a macroblock
                    • H04N 19/50: using predictive coding
                        • H04N 19/597: specially adapted for multi-view video sequence encoding
                    • H04N 19/85: using pre-processing or post-processing specially adapted for video compression
                • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
                    • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
                        • H04N 21/23: Processing of content or additional data; Elementary server operations; Server middleware
                            • H04N 21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
                                • H04N 21/2343: involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
                                    • H04N 21/234363: by altering the spatial resolution, e.g. for clients with a lower screen resolution
                            • H04N 21/239: Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
                                • H04N 21/2393: involving handling client requests
                        • H04N 21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
                            • H04N 21/266: Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
                                • H04N 21/2668: Creating a channel for a dedicated end-user group, e.g. insertion of targeted commercials based on end-user profiles
                    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
                        • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                            • H04N 21/442: Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
                                • H04N 21/44213: Monitoring of end-user related data
                                    • H04N 21/44218: Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
                    • H04N 21/60: Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
                        • H04N 21/65: Transmission of management data between client and server
                            • H04N 21/658: Transmission by the client directed to the server
                                • H04N 21/6587: Control parameters, e.g. trick play commands, viewpoint selection
                    • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
                        • H04N 21/81: Monomedia components thereof
                            • H04N 21/816: involving special video data, e.g. 3D video
                        • H04N 21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
                            • H04N 21/845: Structuring of content, e.g. decomposing content into time segments
                                • H04N 21/8456: by decomposing the content in the time domain, e.g. in time segments

Definitions

  • The present application relates to the field of media standards and media application technologies, and in particular to an image processing method, a terminal, and a server.
  • A 360-degree panoramic video is shot from multiple angles by multiple cameras and supports multi-angle playback.
  • The image signal can be modeled as a spherical signal. As shown in FIG. 1, spherical image signals at different positions on the sphere represent content for different viewing angles.
  • The virtual spherical image signal cannot be viewed directly by the human eye, so the three-dimensional spherical signal needs to be represented as a two-dimensional planar image signal, for example in the form of a latitude and longitude map or a cube map.
  • These representations map the spherical image signal onto a two-dimensional image through some mapping method, turning it into an image signal that the human eye can view directly.
  • The most commonly used viewable image format is the latitude and longitude map.
  • This image is collected by sampling the spherical image signal uniformly by longitude angle in the horizontal direction and uniformly by latitude angle in the vertical direction. Taking the spherical image signal of the earth as an example, the resulting two-dimensional mapped image is shown in FIG. 2.
  • The spherical image signal is a 360-degree panoramic image, while the viewing angle of the human eye is typically about 120 degrees, so the effective spherical signal seen by the human eye at any moment is roughly 22% of the panoramic signal.
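  • (A hedged back-of-the-envelope check, not from the patent text: modeling the visible region as a spherical cap with full viewing angle θ, the visible fraction of the sphere is (1 − cos(θ/2)) / 2, which gives about 0.25 for θ = 120° and about 0.21 for θ = 110°, consistent with the "roughly 22%" figure.)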
  • VR terminal devices such as VR glasses typically support a single viewing angle of between 90 and 110 degrees, which achieves a good user viewing experience.
  • Because the image content within a single viewing angle occupies only a small portion of the entire panoramic image when the user watches, the image information outside the viewing angle is not used, and transmitting the entire panorama wastes bandwidth. Therefore, in view-based video coding (VDC), an encoding and transmission technology for panoramic video, the image in the video is divided into sub-regions and the sub-regions to be transmitted are selected according to the user's current viewing angle, thereby saving bandwidth.
  • The above panoramic video VR encoding and transmission technology may be of two types: 1) Tile-wise encoding and transmission used alone; 2) hybrid whole-image encoding plus Tile-wise encoding and transmission.
  • Tile-wise encoding and transmission refers to dividing an image sequence into image sub-regions and encoding all sub-regions independently to generate one or more code streams.
  • One method of evenly dividing the latitude and longitude map is to divide it uniformly into a number of Tiles along the width and height directions.
  • The client calculates the range covered by the viewing angle in the image according to the position of the user's viewpoint, requests from the server the information of the Tiles to be transmitted (including the position and size of each Tile in the image) together with the code streams corresponding to those Tiles, and then renders and displays the current viewing angle.
  • In the latitude and longitude map, the sampling rate of the image is higher near the equator and lower at the two poles; that is, pixel redundancy is low near the equator and high in the polar parts, and the higher the latitude, the greater the redundancy.
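  • (A hedged note on why this holds: on the sphere, the circle of latitude φ has circumference proportional to cos φ, yet every row of the latitude and longitude map is allotted the same pixel width, so the horizontal oversampling factor is roughly 1 / cos φ, which grows rapidly toward the poles.)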
  • If the latitude and longitude map is divided uniformly, the differing pixel redundancy at different latitudes is not taken into account, and every image block is encoded and transmitted under the same conditions and at the same resolution; coding efficiency is therefore low and transmission bandwidth is wasted.
  • The embodiments of the present invention provide an image processing method, a terminal, and a server, which can solve the problems of low coding efficiency and wasted bandwidth that arise when the latitude and longitude map is divided uniformly.
  • In a first aspect, an image processing method is provided, applied to a server, including: performing horizontal division and vertical division on a latitude and longitude map or a spherical image of an image to be processed to obtain sub-regions of the latitude and longitude map or spherical image, where the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two vertical division intervals exist among the regions formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions; and encoding the images of the obtained sub-regions.
  • In this way, the application can divide vertically according to multiple vertical division intervals, so that the sub-regions of the image can have multiple sizes: the larger the division interval, the larger the sub-region. Coding efficiency during encoding is improved, and the bandwidth occupied when the server transmits the encoded code stream to the terminal is reduced.
  • Optionally, that the division positions of the vertical division are determined by latitude includes: the higher the latitude at which a vertical division position lies, the larger the vertical division interval. In this way, because the sub-regions differ by latitude, higher-latitude sub-regions are larger and divided more coarsely, which improves coding and transmission efficiency and reduces the transmission bandwidth.
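  • For illustration, a minimal Python sketch of this division scheme follows. The band boundaries and per-band division counts are assumptions chosen so that the total matches the 42-sub-region example of FIG. 5; they are not values mandated by the patent.

```python
# Minimal sketch of latitude-dependent division of an ERP (latitude and
# longitude map) image into sub-regions. Horizontal division positions are
# preset latitudes; the vertical division interval grows with latitude.
def divide_erp(width, height, bands):
    """bands: list of (lat_top_deg, lat_bottom_deg, num_vertical_divisions)."""
    regions = []
    for lat_top, lat_bottom, n_cols in bands:
        # Map latitude (+90 at the top row, -90 at the bottom) to pixel rows.
        y0 = int((90 - lat_top) / 180 * height)
        y1 = int((90 - lat_bottom) / 180 * height)
        tile_w = width // n_cols  # larger interval -> fewer, wider sub-regions
        for c in range(n_cols):
            regions.append({"x": c * tile_w, "y": y0,
                            "w": tile_w, "h": y1 - y0})
    return regions

# Example: coarser vertical division at higher latitudes.
bands = [
    ( 90,  60,  3),   # polar band: a few wide sub-regions
    ( 60,  30,  6),
    ( 30,   0, 12),   # equatorial bands: finest division
    (  0, -30, 12),
    (-30, -60,  6),
    (-60, -90,  3),
]
print(len(divide_erp(7680, 3840, bands)))  # 42 sub-regions
```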
  • Optionally, before encoding the images of the obtained sub-regions, the method further includes: sampling the images of the sub-regions in the horizontal direction according to a first sampling interval, where the higher the latitude corresponding to a sub-region, the larger the first sampling interval; encoding the images of the obtained sub-regions then includes encoding the images of the sampled sub-regions. Because pixel redundancy is low near the equator of the latitude and longitude map and high in the images of the two poles, encoding and transmitting every sub-region at the same resolution wastes transmission bandwidth, and the high redundancy imposes high requirements on the decoding capability of the decoding end and lowers the decoding speed. The present application therefore performs horizontal sampling before encoding, and during horizontal sampling the higher the latitude of a sub-region, the larger the first sampling interval; that is, high-latitude sub-regions are down-sampled (compression sampling). This reduces the pixel redundancy of the transmitted high-latitude sub-regions, achieving the goal of reducing bandwidth; and because down-sampling reduces the number of pixels to encode and transmit, the decoding end needs less decoding capability, decoding complexity drops, and decoding speed improves.
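  • A hedged sketch of this pre-encoding step is shown below; the interval policy is an illustrative assumption, since the patent only requires that a higher-latitude sub-region use a larger first sampling interval.

```python
# Latitude-dependent horizontal down-sampling of a sub-region before encoding.
import numpy as np

def first_sampling_interval(lat_center_deg: float) -> int:
    a = abs(lat_center_deg)
    return 1 if a < 30 else 2 if a < 60 else 4  # example policy only

def downsample_horizontal(tile: np.ndarray, interval: int) -> np.ndarray:
    return tile[:, ::interval]  # keep every `interval`-th column

tile = np.zeros((480, 960, 3), dtype=np.uint8)  # a polar sub-region image
out = downsample_horizontal(tile, first_sampling_interval(75.0))
print(out.shape)  # (480, 240, 3): 4x fewer pixels to encode and transmit
```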
  • Optionally, before encoding the images of the obtained sub-regions, the method further includes: sampling the images of the sub-regions in the vertical direction according to a second sampling interval. The second sampling interval may be the same as the sampling interval of the sub-region before sampling, i.e., the original sampling is retained in the vertical direction, or it may differ from the original interval so that the sub-region is down-sampled in the vertical direction as well. In this way, the transmission bandwidth after encoding is also kept small, the decoding complexity at the decoding end is reduced, and the decoding speed improves.
  • Optionally, when the image to be processed is a spherical image, before sampling the images of the sub-regions in the horizontal direction according to the first sampling interval, the method further includes: mapping the image of each sub-region to a two-dimensional planar image according to a preset size; sampling the image of a sub-region laterally according to the first sampling interval then means sampling, in the horizontal direction according to the first sampling interval, the two-dimensional planar image mapped from the image of that sub-region.
  • In this way, when the server collects a spherical image from the shooting device, it can first map the sub-region images of the spherical image to a two-dimensional latitude and longitude map and then down-sample it, so that the server can process the spherical signal collected directly from the shooting device.
  • Optionally, before encoding the images of the sampled sub-regions, the method further includes: adjusting the positions of the sampled sub-regions so that the images of the adjusted sub-regions splice into one image whose horizontal edges are aligned and whose vertical edges are aligned.
  • In this way, the sub-regions can be numbered sequentially in the spliced image, so that the server and the terminal can transmit and process each sub-region according to its number; a sketch of this step follows.
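  • The hedged sketch below assumes each latitude band's tiles share one height and that, after horizontal down-sampling, every band has the same total width, which is what makes the spliced edges align; the concrete sizes are illustrative.

```python
# Reposition sampled sub-regions so they splice into a rectangle with aligned
# edges, numbering them sequentially for later transmission by number.
def splice(bands_of_tiles):
    """bands_of_tiles: list of bands; each band is a list of (w, h) sizes."""
    layout, y = [], 0
    for band in bands_of_tiles:
        x = 0
        for w, h in band:
            layout.append({"num": len(layout), "x": x, "y": y, "w": w, "h": h})
            x += w
        y += band[0][1]  # all tiles in a band share one height
    return layout

# Three bands after sampling: tiles shrink at high latitude but rows align.
layout = splice([[(640, 480)] * 3, [(320, 480)] * 6, [(160, 480)] * 12])
print(layout[0], layout[-1])
```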
  • Optionally, encoding the images of the sampled sub-regions includes: encoding the spliced image Tile by Tile.
  • In this way, a single code stream can be generated for storage, or the single code stream can be cut into the code streams of the individual sub-regions for storage.
  • Optionally, after encoding, the method further includes: independently encapsulating the code streams corresponding to the images of the sub-regions obtained by encoding, and encoding the position information of the sub-regions, where the encoded position information of all sub-regions and the code streams of all sub-regions exist in one track; or the encoded position information and code stream of each sub-region exist in its own track; or the encoded position information of all sub-regions exists in the media presentation description (MPD); or the encoded position information of all sub-regions exists in a private file, whose address exists in the MPD; or the encoded position information of each sub-region exists in the supplementary enhancement information (SEI) of the code stream of that sub-region.
  • Optionally, the sampled sub-regions form a sampled latitude and longitude map, and the position information includes the position and size of the sub-region in the sampled latitude and longitude map; or the position information includes the position and latitude/longitude range of the sub-region in the spherical image, together with the position and size of the sub-region in the spliced image.
  • In this way, the terminal can render the image for playback and display according to the position and size of the sub-region.
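  • As a concrete illustration, one hypothetical serialization of such a per-sub-region record is sketched below; the patent specifies the kinds of fields, not this layout or these field names.

```python
# Hypothetical position-information record for one sub-region.
subregion_info = {
    "number": 7,
    # position and size in the sampled / spliced latitude and longitude map
    "sampled": {"x": 1280, "y": 0, "w": 640, "h": 480},
    # position and latitude/longitude range on the source sphere
    "sphere": {"lon_start": 120.0, "lon_end": 240.0,
               "lat_start": 60.0, "lat_end": 90.0},
}
```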
  • Optionally, the private file further includes information characterizing the correspondence between the number of a user viewpoint and the sub-regions covered by the viewing angle of that viewpoint.
  • In this way, after the terminal determines the user viewpoint, the sub-regions covered by the viewing angle of that viewpoint can be determined directly from the correspondence, so that decoding proceeds on the code streams of those sub-regions, and the decoding speed of the terminal can be improved.
  • Optionally, the private file further includes information indicating the number of sub-regions to be preferentially displayed among the sub-regions covered by the user's viewing angle, the numbers of the sub-regions to be preferentially displayed, the numbers of the sub-regions displayed with secondary priority, and the numbers of the sub-regions not displayed.
  • In this way, the terminal can preferentially acquire the images of the sub-regions near the viewpoint for priority display, and discard the image data of the sub-regions that are not displayed.
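  • A hypothetical sketch of such private-file content follows; the field names and numbering are assumptions used only to make the correspondence and display priorities concrete.

```python
# Hypothetical private-file content: viewpoint number -> covered sub-regions,
# ranked for display priority as described above.
private_file = {
    "viewpoints": {
        0: {                               # viewpoint number
            "covered": [5, 6, 13, 14],     # sub-regions covered by its view
            "display_first": [6, 13],      # preferentially displayed
            "display_second": [5, 14],     # secondary display priority
            "not_displayed": [],           # image data may be discarded
        },
    },
}
```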
  • Optionally, the latitude and longitude map includes a latitude and longitude map corresponding to the left eye and a latitude and longitude map corresponding to the right eye; before the horizontal and vertical division of the latitude and longitude map or spherical image of the image to be processed, the method further includes: separating the latitude and longitude map corresponding to the left eye from the latitude and longitude map corresponding to the right eye; the division then includes horizontally and vertically dividing the latitude and longitude map corresponding to the left eye, and horizontally and vertically dividing the latitude and longitude map corresponding to the right eye. In this way, for 3D video images, the sub-region division manner of the present application likewise reduces bandwidth and improves the efficiency of code transmission.
  • Optionally, after encoding, the method further includes: sending the code streams corresponding to the images of the sub-regions obtained by encoding to the terminal; or receiving viewing-angle information sent by the terminal, obtaining the sub-regions corresponding to that information, and sending their code streams to the terminal; or receiving the numbers of sub-regions sent by the terminal and sending the code streams corresponding to those numbers to the terminal.
  • In this way, the terminal may obtain the code streams of the desired sub-regions locally; or the server may determine the sub-regions from the viewing-angle information and send the corresponding code streams; or the terminal may determine the numbers of the required sub-regions and notify the server, which then sends the corresponding code streams, reducing the computing load of the server.
  • Optionally, the latitude and longitude map is the latitude and longitude map of a 360-degree panoramic video image, or a portion of it; or the spherical image is the spherical image of a 360-degree panoramic video image, or a portion of it. That is, the sub-region division of the present application also applies to 180-degree half-panoramic video images, reducing bandwidth and improving coding and transmission efficiency for their transmission.
  • In a second aspect, a method for processing an image is provided, applied to a terminal, including: determining the position information of each sub-region of a panoramic image; determining, according to the determined position information of each sub-region, the position information of the sub-regions covered by the current viewing angle in the panoramic image, and determining the first sampling interval of each such sub-region; acquiring, according to the determined position information of the sub-regions covered by the current viewing angle, the code streams corresponding to those sub-regions; decoding the code streams to obtain the images of the sub-regions covered by the current viewing angle; and resampling the decoded images according to the determined position information of the covered sub-regions and the first sampling interval, and playing the resampled images.
  • In this way, unlike the prior art, in which the sub-regions are divided uniformly and the image is decoded and resampled at a single predetermined sampling interval, the sampling interval here can change with the position of the sub-region, and the terminal resamples the image according to these different sampling intervals, which can improve the display speed of the decoded image.
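  • A hedged sketch of the terminal-side resampling step follows: a decoded sub-region that the server down-sampled horizontally by `interval` is stretched back before rendering; nearest-neighbour stretching is used only for brevity.

```python
# Restore the display width of a decoded, horizontally down-sampled sub-region.
import numpy as np

def resample_for_display(decoded: np.ndarray, interval: int) -> np.ndarray:
    return np.repeat(decoded, interval, axis=1)  # undo column decimation

decoded = np.zeros((480, 240, 3), dtype=np.uint8)  # decoded polar sub-region
print(resample_for_display(decoded, 4).shape)      # (480, 960, 3)
```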
  • Optionally, determining the position information of each sub-region of the panoramic image includes: receiving first information sent by the server, where the first information includes the tracks of the sub-regions of the panoramic image and the code streams of the sub-regions, and the tracks include the position information of all the sub-regions of the panoramic image; and obtaining the position information of each sub-region in the panoramic image according to the tracks.
  • Optionally, determining the position information of each sub-region in the panoramic image includes: receiving a media presentation description (MPD) sent by the server, where the MPD includes the position information of each sub-region, or the MPD includes the address of a private file and the private file includes the position information of each sub-region; and parsing the MPD to obtain the position information of each sub-region.
  • Optionally, the position information of a sub-region exists in the supplementary enhancement information (SEI) of the code stream corresponding to that sub-region.
  • Optionally, acquiring the code streams corresponding to the sub-regions covered by the current viewing angle includes: obtaining the code streams corresponding to the sub-regions covered by the current viewing angle from the memory of the terminal; or requesting them from the server.
  • Optionally, requesting from the server the code streams corresponding to the sub-regions covered by the current viewing angle includes: sending information indicating the current viewing angle to the server and receiving the code streams corresponding to the covered sub-regions sent by the server; or obtaining the code streams corresponding to the covered sub-regions from the server according to a protocol preset between the terminal and the server, where the protocol includes the correspondence between viewing angles and the sub-regions they cover. This can improve the speed at which the terminal obtains the code streams of the sub-regions from the server.
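  • A hedged sketch of the preset-protocol variant follows; the correspondence table and the URL scheme are illustrative assumptions, not part of the patent.

```python
# The terminal maps its current viewing angle to sub-region numbers through a
# preset view-to-sub-region table, then requests those code streams.
VIEW_TO_SUBREGIONS = {0: [5, 6, 13, 14], 1: [6, 7, 14, 15]}  # preset protocol

def request_streams(view_id: int, fetch) -> None:
    for num in VIEW_TO_SUBREGIONS[view_id]:
        fetch(f"https://server.example/streams/sub_{num}.bin")  # hypothetical

request_streams(0, fetch=print)  # prints the four stream URLs for view 0
```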
  • Optionally, determining the first sampling interval of a sub-region includes: determining a preset sampling interval as the first sampling interval; or receiving the first sampling interval from the server; or obtaining the first sampling interval according to the position information of each sub-region received from the server, that is, when the position information of sub-regions differs, the corresponding first sampling intervals may also differ.
  • In a third aspect, a method for processing an image is provided, applied to a server, including: saving the code streams corresponding to the images of the sub-regions of a latitude and longitude map or a spherical image of a panoramic image, where the sub-regions are obtained by horizontal division and vertical division of the latitude and longitude map or spherical image, the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two vertical division intervals exist among the regions formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions; and sending to the terminal, among the saved code streams corresponding to the images of the sub-regions, the code streams of the sub-regions covered by the current viewing angle, as requested by the terminal.
  • In this way, when the code streams corresponding to the images of the sub-regions saved by the server are transmitted to the terminal, vertically dividing the sub-regions with at least two vertical division intervals across different latitudes avoids the problems of the prior art: the application can divide vertically according to multiple vertical division intervals, so that the sub-regions of the image can have multiple sizes; the larger the division interval, the larger the sub-region; coding efficiency during encoding is improved, and the bandwidth occupied when the server transmits the encoded code stream to the terminal is reduced.
  • Optionally, the image corresponding to a sub-region stored in the server has been sampled in the horizontal direction according to the first sampling interval before encoding, where the higher the latitude corresponding to the sub-region, the larger the first sampling interval; or it has been sampled in the vertical direction according to the second sampling interval.
  • In a fourth aspect, a server is provided, including: a dividing unit configured to perform horizontal division and vertical division on a latitude and longitude map or a spherical image of an image to be processed to obtain sub-regions of the latitude and longitude map or spherical image, where the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two vertical division intervals exist among the regions formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions; and an encoding unit configured to encode the images of the obtained sub-regions.
  • Optionally, that the division positions of the vertical division are determined by latitude includes: the higher the latitude at which a vertical division position lies, the larger the vertical division interval.
  • Optionally, the server further includes a sampling unit configured to sample the images of the sub-regions in the horizontal direction according to the first sampling interval, where the higher the latitude corresponding to a sub-region, the larger the first sampling interval; and the encoding unit is configured to encode the images of the sampled sub-regions.
  • Optionally, the sampling unit is further configured to sample the images of the sub-regions in the vertical direction according to the second sampling interval.
  • Optionally, the sampling unit is further configured to map the image of each sub-region to a two-dimensional planar image according to a preset size, and to sample, in the horizontal direction according to the first sampling interval, the two-dimensional planar image mapped from the image of the sub-region.
  • Optionally, the server further includes a splicing unit configured to adjust the positions of the sampled sub-regions so that the horizontal edges of the images of the adjusted sub-regions are aligned and the vertical edges are aligned.
  • Optionally, the encoding unit is configured to encode the spliced image Tile by Tile.
  • Optionally, the server further includes an encapsulating unit configured to independently encapsulate the code streams corresponding to the images of the sub-regions obtained by encoding, and to encode the position information of the sub-regions, where the encoded position information of all sub-regions and the code streams of all sub-regions exist in one track; or the encoded position information and code stream of each sub-region exist in its own track; or the encoded position information of all sub-regions exists in the media presentation description (MPD); or the encoded position information of all sub-regions exists in a private file, whose address exists in the MPD; or the encoded position information of each sub-region exists in the supplementary enhancement information (SEI) of the code stream of that sub-region.
  • Optionally, the sampled sub-regions form a sampled latitude and longitude map, and the position information includes the position and size of the sub-region in the sampled latitude and longitude map; or the position information includes the position and latitude/longitude range of the sub-region in the spherical image, together with the position and size of the sub-region in the spliced image.
  • Optionally, the private file further includes information characterizing the correspondence between the number of a user viewpoint and the sub-regions covered by the viewing angle of that viewpoint.
  • Optionally, the private file further includes information indicating the number of sub-regions to be preferentially displayed among the sub-regions covered by the user's viewing angle, the numbers of the sub-regions to be preferentially displayed, the numbers of the sub-regions displayed with secondary priority, and the numbers of the sub-regions not displayed.
  • Optionally, the latitude and longitude map includes a latitude and longitude map corresponding to the left eye and a latitude and longitude map corresponding to the right eye; the dividing unit is configured to separate the latitude and longitude map corresponding to the left eye from the latitude and longitude map corresponding to the right eye, and to horizontally and vertically divide the latitude and longitude map corresponding to the left eye and horizontally and vertically divide the latitude and longitude map corresponding to the right eye.
  • Optionally, the server further includes a transmission unit configured to: send the code streams corresponding to the images of the sub-regions obtained by encoding to the terminal; or receive viewing-angle information sent by the terminal, obtain the sub-regions corresponding to that information, and send their code streams to the terminal; or receive the numbers of sub-regions sent by the terminal and send the code streams corresponding to those numbers to the terminal.
  • Optionally, the latitude and longitude map is the latitude and longitude map of a 360-degree panoramic video image, or a portion of it; or the spherical image is the spherical image of a 360-degree panoramic video image, or a portion of it.
  • In a fifth aspect, a terminal is provided, including: an acquiring unit configured to determine the position information of each sub-region of a panoramic image, to determine, according to the determined position information of each sub-region, the position information of the sub-regions covered by the current viewing angle in the panoramic image and the first sampling interval of each such sub-region, and to acquire, according to the determined position information of the sub-regions covered by the current viewing angle, the code streams corresponding to those sub-regions; a decoding unit configured to decode the code streams to obtain the images of the sub-regions covered by the current viewing angle; a resampling unit configured to resample the decoded images according to the determined position information of the covered sub-regions and the first sampling interval; and a playback unit configured to play the resampled images.
  • Optionally, the acquiring unit is configured to receive first information sent by the server, where the first information includes the tracks of the sub-regions of the panoramic image and the code streams of the sub-regions, and the tracks include the position information of all the sub-regions of the panoramic image; the acquiring unit is further configured to obtain the position information of each sub-region in the panoramic image according to the tracks.
  • Optionally, the acquiring unit is configured to: receive a media presentation description (MPD) sent by the server, where the MPD includes the position information of each sub-region, or the MPD includes the address of a private file and the private file includes the position information of each sub-region; and parse the MPD to obtain the position information of each sub-region.
  • Optionally, the position information of a sub-region exists in the supplementary enhancement information (SEI) of the code stream corresponding to that sub-region.
  • Optionally, the acquiring unit is configured to: obtain the code streams corresponding to the sub-regions covered by the current viewing angle from the memory of the terminal; or request from the server the code streams corresponding to the sub-regions covered by the current viewing angle.
  • Optionally, the acquiring unit is configured to: send information indicating the current viewing angle to the server and receive the code streams corresponding to the sub-regions covered by the current viewing angle sent by the server; or obtain the code streams corresponding to the sub-regions covered by the current viewing angle from the server according to a protocol preset between the terminal and the server, where the protocol includes the correspondence between viewing angles and the sub-regions they cover.
  • Optionally, the acquiring unit is configured to: determine a preset sampling interval as the first sampling interval; or receive the first sampling interval from the server.
  • In a sixth aspect, a server is provided, including: a storage unit configured to save the code streams corresponding to the images of the sub-regions of a latitude and longitude map or a spherical image of a panoramic image, where the sub-regions are obtained by horizontal division and vertical division of the latitude and longitude map or spherical image of the panoramic image, the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two vertical division intervals exist among the regions formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions; and a transmission unit configured to send to the terminal, among the saved code streams corresponding to the images of the sub-regions, the code streams of the sub-regions covered by the current viewing angle, as requested by the terminal.
  • Optionally, the image corresponding to a sub-region stored in the server has been sampled in the horizontal direction according to the first sampling interval before encoding, where the higher the latitude corresponding to the sub-region, the larger the first sampling interval; or it has been sampled in the vertical direction according to the second sampling interval. That is to say, down-sampling (compression sampling) the high-latitude sub-regions in the horizontal direction reduces the image pixel redundancy of the transmitted high-latitude sub-regions and thereby the bandwidth; at the same time, down-sampling reduces the number of pixels to be encoded and transmitted, so the decoding end needs less decoding capability, decoding complexity drops, and decoding speed improves.
  • An embodiment of the present application provides a computer storage medium for storing the computer software instructions used by the above server, including a program designed to perform the above aspects.
  • An embodiment of the present application provides a computer storage medium for storing the computer software instructions used by the above terminal, including a program designed to perform the above aspects.
  • Embodiments of the present application provide a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods of the above aspects.
  • Embodiments of the present application provide an image processing method, a terminal, and a server, including: performing horizontal division and vertical division on a latitude and longitude map or a spherical image of an image to be processed to obtain sub-regions of the latitude and longitude map or spherical image, where the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two vertical division intervals exist among the regions formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions; and encoding the images of the obtained sub-regions. In this way, the application can divide vertically according to multiple vertical division intervals, so that the sub-regions of the image can have multiple sizes: the larger the division interval, the larger the sub-region. Coding efficiency during encoding is improved, and the bandwidth occupied when the server transmits the code stream to the terminal after encoding is reduced.
  • FIG. 1 is a schematic diagram of a 360-degree panoramic image signal according to an embodiment of the present application;
  • FIG. 2 is a schematic diagram of a 360-degree panoramic image signal converted into a latitude and longitude map according to an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a network architecture according to an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a method for processing an image according to an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a latitude and longitude map divided into 42 sub-regions according to an embodiment of the present application;
  • FIG. 6 is a schematic diagram of a latitude and longitude map divided into 50 sub-regions according to an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of a method for processing an image according to an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a viewing-angle area in a latitude and longitude map according to an embodiment of the present application;
  • FIG. 9 is a schematic diagram of sub-regions covered by a viewing angle according to an embodiment of the present application;
  • FIG. 10 is a schematic flowchart of a method for processing an image according to an embodiment of the present application;
  • FIG. 11 is a schematic diagram of a terminal decoding and display process according to an embodiment of the present application;
  • FIG. 12 is a schematic diagram of sub-region division of a 3D latitude and longitude map according to an embodiment of the present application;
  • FIG. 13 is a schematic diagram of horizontal division of a latitude and longitude map of a 180-degree half-panoramic video according to an embodiment of the present application;
  • FIG. 14 is a schematic diagram of a sub-region division manner of a 3D 180-degree half-panoramic video according to an embodiment of the present application;
  • FIG. 15 is a schematic flowchart of a method for processing an image according to an embodiment of the present application;
  • FIG. 16 is a schematic diagram of a manner of dividing a spherical panoramic signal to obtain image sub-regions according to an embodiment of the present application;
  • FIG. 17 is a schematic flowchart of a method for processing an image according to an embodiment of the present application;
  • FIG. 18 is a schematic structural diagram of a server according to an embodiment of the present application;
  • FIG. 19 is a schematic structural diagram of a server according to an embodiment of the present application;
  • FIG. 20 is a schematic structural diagram of a server according to an embodiment of the present application;
  • FIG. 21 is a schematic structural diagram of a terminal according to an embodiment of the present application;
  • FIG. 22 is a schematic structural diagram of a terminal according to an embodiment of the present application;
  • FIG. 23 is a schematic structural diagram of a terminal according to an embodiment of the present application;
  • FIG. 24 is a schematic structural diagram of a server according to an embodiment of the present application;
  • FIG. 25 is a schematic structural diagram of a server according to an embodiment of the present application;
  • FIG. 26 is a schematic structural diagram of a server according to an embodiment of the present application.
  • Panoramic video: refers to VR panoramic video, also known as 360-degree panoramic video or 360 video; a video shot in all directions (360 degrees) with multiple cameras, in which the user can freely adjust the viewing direction while watching.
  • 3D panoramic video: refers to VR panoramic video in 3D format; the video includes two 360-degree panoramic videos, one for left-eye display and one for right-eye display, and the contents of the two videos differ slightly within the same frame, giving the user a 3D effect while watching.
  • Latitude and longitude map: the equirectangular projection (ERP), one of the panoramic image formats; a two-dimensional panoramic image that can be saved and transmitted, obtained by uniformly sampling the spherical signal at equal longitude intervals and equal latitude intervals. The horizontal and vertical coordinates of the image can be expressed in longitude and latitude: the width direction corresponds to longitude, with a span of 360°, and the height direction corresponds to latitude, with a span of 180°.
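  • As a hedged illustration of the pixel-to-angle relation this implies (the exact convention, e.g. where longitude zero sits, is an assumption):

```python
# Map longitude/latitude to ERP pixel coordinates: width spans 360 degrees of
# longitude, height spans 180 degrees of latitude (+90 at the top row).
def erp_pixel(lon_deg: float, lat_deg: float, width: int, height: int):
    x = (lon_deg % 360.0) / 360.0 * width
    y = (90.0 - lat_deg) / 180.0 * height
    return x, y

print(erp_pixel(180.0, 0.0, 3840, 1920))  # centre of a 3840x1920 ERP: (1920.0, 960.0)
```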
  • Video decoding: the process of restoring a code stream to a reconstructed image according to specific syntax rules and processing methods.
  • Video encoding: the process of compressing an image sequence into a code stream.
  • Video coding: a general term covering video encoding and video decoding; its Chinese translation is the same as that of video encoding.
  • Tile: in the video coding standard High Efficiency Video Coding (HEVC), a block-shaped coding region obtained by dividing the image to be encoded. One frame of image can be divided into multiple Tiles, which together constitute the frame, and each Tile can be encoded independently.
  • The Tile in the present application may be a Tile to which the motion-constrained tile sets (MCTS) technology is applied.
  • MCTS: motion-constrained tile sets, a coding technique for Tiles that restricts the motion vectors inside a Tile during encoding, so that a Tile at the same position in the image sequence never references image pixels outside that Tile region in the time domain, and each Tile can therefore be decoded independently in the time domain.
  • Sub-picture: a part of an original image obtained by dividing the entire image, called a sub-image of that image.
  • The sub-images in this application may be sub-images that are square in shape.
  • Image sub-area The image sub-area in this application can be used as a general term for a Tile or a sub-image, and can be simply referred to as a sub-area.
  • VDC: view-based video coding, a coding and transmission technology for panoramic video, that is, a method of encoding and transmission based on the viewing angle of the user.
  • Tile-wise coding: a method of video coding that divides an image sequence into multiple image sub-regions and encodes the sub-regions separately to generate one or more code streams.
  • the Tile-wise encoding in this application may be a Tile-wise encoding in a VDC.
  • Track: can be translated as "track"; it refers to a series of samples with time attributes encapsulated according to the International Organization for Standardization (ISO) base media file format (ISOBMFF), for example a video track.
  • Box: can be translated as "box"; in the standard it is an object-oriented building block defined by a unique type identifier and a length. In some specifications it is called an "atom", including in the first definition of MP4. The box is the basic unit that makes up an ISOBMFF file, and a box can contain other boxes.
  • Supplemental enhancement information (SEI): a type of Network Abstraction Layer Unit (NALU) defined in the video codec standards (H.264, H.265).
  • MPD: a document specified in the standard ISO/IEC 23009-1 that contains metadata from which the client constructs HTTP Uniform Resource Locators (URLs). The MPD includes one or more period elements; each period element contains one or more adaptation sets; each adaptation set contains one or more representations; and each representation contains one or more segments. The client selects a representation according to the information in the MPD and constructs the HTTP URL of a segment.
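  • As an illustration of the period/adaptation-set/representation/segment hierarchy described above, the following minimal Python sketch walks an MPD with the standard-library XML parser; the file name example.mpd and the printed attributes are assumptions for illustration only:

    import xml.etree.ElementTree as ET

    NS = {'mpd': 'urn:mpeg:dash:schema:mpd:2011'}  # DASH MPD XML namespace

    tree = ET.parse('example.mpd')  # hypothetical MPD file
    for period in tree.getroot().findall('mpd:Period', NS):
        for aset in period.findall('mpd:AdaptationSet', NS):
            for rep in aset.findall('mpd:Representation', NS):
                # the client picks one representation and builds the
                # HTTP URLs of its segments from the MPD metadata
                print(rep.get('id'), rep.get('bandwidth'))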
  • The ISO base media file format is composed of a series of boxes, and a box can contain other boxes. The boxes include a metadata box and a media data box: the metadata box (moov box) contains metadata, and the media data box (mdat box) contains media data. The metadata box and the media data box may be in the same file or in separate files.
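  • As a minimal sketch of this box structure, the following Python snippet lists the top-level boxes (such as moov and mdat) of an ISOBMFF file; it assumes plain 32-bit box sizes and does not handle the 64-bit "largesize" field or "uuid" boxes:

    import struct

    def list_top_level_boxes(path):
        # Each box starts with a 4-byte big-endian size and a 4-byte type.
        with open(path, 'rb') as f:
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                size, box_type = struct.unpack('>I4s', header)
                print(box_type.decode('ascii', 'replace'), size)
                f.seek(size - 8, 1)  # skip the box payload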
  • The embodiments of the present application can be used for processing panoramic video, or part of a panoramic video, before encoding, and for encapsulating the encoded code stream; corresponding operations and processing are involved on both the server and the terminal.
  • The network architecture of the present application may include a server 31 and a terminal 32. A photographing device 33 also communicates with the server 31; it can be used to shoot 360-degree panoramic video and transmit the video to the server 31.
  • the server may perform pre-coding processing on the panoramic video, and then perform encoding or transcoding operations, and then encapsulate the encoded code stream into a transmittable file, and transmit the file to the terminal or the content distribution network.
  • The server may also select the content to be transmitted according to information fed back by the terminal (for example, the user's viewing angle).
  • The terminal 32 can be an electronic device that can be connected to a network, such as VR glasses, a mobile phone, a tablet computer, a television, or a computer.
  • The terminal 32 can receive the data transmitted by the server 31 and perform code stream decapsulation, decoding, display, and the like.
  • If the latitude and longitude map is divided uniformly when processing the image, transmission bandwidth is wasted during encoding and transmission, and high demands are placed on the decoding capability and decoding speed of the decoding end.
  • Therefore, an image processing method is provided that divides the latitude and longitude map into multiple image sub-regions and processes them accordingly, together with corresponding encoding, transmission, decoding, and presentation modes.
  • In the latitude and longitude map, the longitude ranges from 0 to 360 degrees, and the latitude ranges from -90 to 90 degrees; negative degrees indicate south latitude, and positive degrees indicate north latitude.
  • the method may include:
  • Step 401: The server horizontally divides the latitude and longitude map of the image to be processed, where the division positions of the horizontal division are preset latitudes.
  • The image to be processed may be one of a plurality of sequential images of a video.
  • Taking the latitude and longitude map, shown in (a) of FIG. 5, obtained by the server from the video captured by the photographing device as an example, the server horizontally divides the map along the latitude lines at -60°, -30°, 0°, 30°, and 60°, where the latitude value is denoted X.
  • The equator of the latitude and longitude map is at latitude 0°, between 90° north latitude and 90° south latitude.
  • That is, the map is horizontally divided at north latitudes 30° and 60° and at south latitudes -30° and -60°, so the horizontal division interval is 30°.
  • the division interval can also be understood as the division step size.
  • Step 402: The server vertically divides the latitude and longitude map of the image to be processed, where the division positions of the vertical division are determined by latitude, at least two different vertical division intervals exist among the areas formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions. The division yields the sub-regions of the latitude and longitude map.
  • The vertical division intervals of different latitude bands may differ in both the south latitude and north latitude portions of the map, while the vertical division interval of a south latitude band may be the same as that of the corresponding north latitude band.
  • For example, sub-images in the latitude range -90° to -60° of the south latitude portion and 60° to 90° of the north latitude portion may be vertically divided at a longitude interval of 120°, yielding 3 sub-regions each; sub-images in the latitude ranges -60° to -30° and 30° to 60° are vertically divided at a longitude interval of 60°, yielding 6 sub-regions each; and sub-images in the latitude ranges -30° to 0° and 0° to 30° are vertically divided at a longitude interval of 30°, yielding 12 sub-regions each.
  • In this case the vertical division intervals include longitudes of 120°, 60°, and 30°, and the latitude and longitude map is divided into 42 sub-regions in total.
  • In a different division manner, the latitude and longitude map can instead be divided into 50 sub-regions:
  • For sub-images in the latitude ranges -90° to -60° and 60° to 90°, no vertical division is performed and a single sub-region is maintained; for sub-images in the latitude ranges -60° to -30° and 30° to 60°, vertical division at a longitude interval of 30° yields 12 sub-regions each; and for sub-images in the latitude ranges -30° to 0° and 0° to 30°, vertical division at a longitude interval of 30° likewise yields 12 sub-regions each.
  • Here the division step sizes include a longitude of 30° and a longitude of 0°, where a division step of 0° means that the sub-image is not vertically divided.
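  • To make the first division manner concrete, the following Python sketch computes the sub-region rectangles of a latitude and longitude map; the function name and the pixel-rectangle representation are illustrative, not part of the embodiments:

    # Vertical division interval (degrees of longitude) per 30-degree
    # latitude band, for the first division manner (42 sub-regions).
    BAND_INTERVALS = [
        ((60, 90), 120), ((30, 60), 60), ((0, 30), 30),
        ((-30, 0), 30), ((-60, -30), 60), ((-90, -60), 120),
    ]

    def divide(width, height):
        """Return (x, y, w, h) pixel rectangles of the sub-regions of a
        width x height latitude and longitude map (top row = north pole)."""
        regions = []
        for (lat_lo, lat_hi), lon_step in BAND_INTERVALS:
            y = int((90 - lat_hi) / 180 * height)   # top edge of the band
            h = int((lat_hi - lat_lo) / 180 * height)
            w = int(lon_step / 360 * width)
            for x in range(0, width, w):
                regions.append((x, y, w, h))
        return regions

    # 3 + 6 + 12 + 12 + 6 + 3 = 42 sub-regions, as in the example above
    assert len(divide(3840, 1920)) == 42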
  • Step 403: The server encodes the obtained image of each sub-region.
  • Compared with the fine uniform division of the prior art, the present application vertically divides the latitude and longitude map according to multiple vertical division intervals, so that sub-regions of multiple sizes are obtained; the larger the vertical division interval, the larger the sub-region. For example, the higher the latitude at which the vertical division is performed, the larger the vertical division interval and the larger the sub-region, which improves the coding efficiency during encoding and reduces the bandwidth occupied when the server transmits the code stream to the terminal.
  • Without further processing, however, the terminal obtains more redundant pixels, so that the terminal needs a higher maximum decoding capability and the requirement on decoding speed also increases.
  • Therefore, the present application can also perform de-redundancy processing, that is, downsampling, on the pixels of the sub-regions after the non-uniform division, so that fewer pixels need to be encoded and transmitted and the maximum decoding capability required at the decoding end is reduced, which lowers the decoding complexity and improves the decoding speed of the decoder. As shown in FIG. 7, before step 403, the method of the present application may therefore further include:
  • Step 404: The server keeps the image of each sub-region at its original sampling in the vertical direction, or samples the image of each sub-region in the vertical direction according to a second sampling interval.
  • Original sampling means that the image of each sub-region remains unchanged in the vertical direction, that is, it is not scaled.
  • Sampling according to the second sampling interval, for example downsampling all sub-regions in the vertical direction, can also be understood as sampling in the vertical direction according to a given sub-region height.
  • Step 405: The server samples the image of each sub-region in the horizontal direction according to a first sampling interval.
  • The first sampling interval and the second sampling interval may be preset on the server side, and they may be the same or different.
  • The first sampling interval can be understood as the reciprocal of the scaling factor, that is, one pixel is sampled for every several pixels to obtain the scaled image.
  • For example, for sub-images in the latitude ranges -90° to -60° and 60° to 90°, horizontal downsampling is performed with a scaling factor of 1/4; for sub-images in the latitude ranges -60° to -30° and 30° to 60°, horizontal downsampling is likewise performed, with a scaling factor of 1/2; for sub-images in the latitude ranges -30° to 0° and 0° to 30°, no horizontal scaling is performed.
  • The resulting sampled image is shown in (b) of FIG. 5.
  • It should be noted that, as (b) of FIG. 5 shows, the first sampling interval is proportional to latitude during horizontal sampling: for the north latitude portion, the higher the latitude of a sub-region, the larger the first sampling interval; likewise, for the south latitude portion, the higher the latitude, the larger the first sampling interval.
  • The south latitude portion and the north latitude portion have the same sampling interval at the same latitude.
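  • The latitude-dependent horizontal downsampling of this example can be sketched in Python as follows (nearest-neighbour column dropping; the function names are illustrative):

    import numpy as np

    def scale_factor(lat_lo, lat_hi):
        # Scaling factor per latitude band, as in the example: 1/4 above
        # 60 degrees, 1/2 between 30 and 60 degrees, 1 near the equator.
        top = max(abs(lat_lo), abs(lat_hi))
        return 0.25 if top > 60 else 0.5 if top > 30 else 1.0

    def downsample_horizontal(img, factor):
        # Keep one column per (1 / factor) columns; the first sampling
        # interval is the reciprocal of the scaling factor.
        step = int(round(1 / factor))
        return img[:, ::step]

  • For instance, a 1280-pixel-wide sub-region of a 3840x1920 map in the 60° to 90° band becomes 320 pixels wide, matching the (1280, 320) source size and (320, 320) packed size used in the examples below.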
  • It can be seen that the sizes of the sub-regions of the latitude and longitude map after division and scaling are not uniform; the downsampling in the vertical direction in (b) of FIG. 5 may also be non-uniform across latitudes.
  • To improve the code transmission efficiency of the server during encoding and transmission, the scaled sub-regions may be rearranged and combined.
  • the method may further include:
  • Step 406: The server adjusts the positions of the sampled sub-regions so that the horizontal edges of the sub-region images are aligned and the vertical edges are aligned.
  • The image after the position adjustment can be as shown in (c) of FIG. 5.
  • Step 403 can be replaced by:
  • Step 407: The server encodes the image of each sub-region after sampling.
  • Two encoding modes can be used here: (1) sub-picture coding, in which each sub-picture sequence is independently encoded to generate 42 sub-code streams, that is, each sub-picture corresponds to one code stream; the sub-pictures may be the sub-regions described above, so the 42 sub-regions are encoded separately to obtain a code stream corresponding to each sub-region; (2) tile-mode coding of the entire image, in which the MCTS technique may be used during encoding to generate a single code stream of the entire image for storage, or that single code stream may be cut into 42 sub-streams for storage.
  • The entire image here may be the image obtained by sampling and scaling the source latitude and longitude map, as shown in (b) of FIG. 5, or the regular image obtained by recombining the sampled and scaled image, as shown in (c) of FIG. 5.
  • After the images are encoded, the server also needs to encapsulate the code streams of the sub-regions obtained by the encoding. The method may therefore further include:
  • Step 408: The server independently encapsulates the code streams corresponding to the images of the sub-regions and encodes the location information of each sub-region.
  • The server may encapsulate the code streams of all sub-regions in a single track, for example a tile track, or encapsulate the code stream of each sub-region in its own track.
  • The location information of a sub-region can be understood as description information of the sub-region division manner. The encoded location information of all sub-regions may exist in one track together with the code streams of all sub-regions; or the encoded location information and code stream of each sub-region may exist in its own track; or the encoded location information of all sub-regions may exist in a media presentation description (MPD); or the encoded location information of all sub-regions may exist in a private file whose address exists in the MPD; or the encoded location information of each sub-region may exist in the supplemental enhancement information (SEI) of that sub-region's code stream.
  • When the sampled sub-regions form a sampled latitude and longitude map, the location information includes the position and size of the sub-region in the latitude and longitude map and the position and size of the sub-region in the sampled latitude and longitude map; or, the location information includes the position and size of the sub-region in the latitude and longitude map and the position and size of the sub-region in the spliced image.
  • The size can include a width and a height.
  • Manner 1: the location information of all sub-regions is saved in one track. Description information of the division manner of all sub-regions may be added to the track of the spliced image, for example by adding the following description to the moov box in that track:
  • RectRegionPacking(i): describes the division information of the i-th sub-region.
  • proj_reg_width[i], proj_reg_height[i]: describe the width and height of the i-th sub-region of the sampled image in the source image, that is, in the latitude and longitude map before sampling (for example, (a) of FIG. 5). For example, for a latitude and longitude map of width x height 3840x1920, the width and height of the first sub-region in the upper left corner of (b) of FIG. 5, measured in the source image, are (1280, 320).
  • proj_reg_top[i], proj_reg_left[i]: describe the position of the upper-left pixel of the i-th sub-region of the sampled image in the source image; for example, the position of the first sub-region in the upper left corner of (b) of FIG. 5 in the source image is (0, 0). Positions are measured relative to the (0, 0) coordinate at the upper-left corner of the source image.
  • transform_type[i]: describes the transformation applied to the i-th sub-region of the sampled image relative to its corresponding position in the source image, for example: untransformed / rotated by 90 degrees / rotated by 180 degrees / rotated by 270 degrees / horizontally mirrored / horizontally mirrored after a 90-degree rotation / horizontally mirrored after a 180-degree rotation / horizontally mirrored after a 270-degree rotation.
  • packed_reg_width[i], packed_reg_height[i]: describe the width and height of the i-th sub-region of the sampled image in the combined regular image, that is, in (c) of FIG. 5; for example, the width and height of the first sub-region in the upper left corner of (b) of FIG. 5 in the combined regular image are (320, 320).
  • It should be noted that, when step 406 is not performed, the image after sub-region combination is (b) of FIG. 5, and the width and height refer to the width and height in (b) of FIG. 5.
  • packed_reg_top[i], packed_reg_left[i]: describe the position of the upper-left pixel of the i-th sub-region of the sampled image in the combined regular image, that is, the upper-left position of each sub-region in (c) of FIG. 5. It should be noted that, when step 406 is not performed, the image after sub-region combination is (b) of FIG. 5, and the position refers to the position in (b) of FIG. 5.
  • Manner 2: the location information of each sub-region is saved in the track corresponding to that sub-region, for example by adding the following box, whose fields are described below (the field bit widths follow common ISOBMFF practice and are illustrative):

    aligned(8) class SubPictureCompositionBox extends TrackGroupTypeBox('spco') {
        unsigned int(16) track_x;
        unsigned int(16) track_y;
        unsigned int(16) track_width;
        unsigned int(16) track_height;
        unsigned int(16) composition_width;
        unsigned int(16) composition_height;
        unsigned int(16) proj_tile_x;
        unsigned int(16) proj_tile_y;
        unsigned int(16) proj_tile_width;
        unsigned int(16) proj_tile_height;
        unsigned int(16) proj_width;
        unsigned int(16) proj_height;
    }
  • track_x, track_y: describe the position of the upper-left pixel of the current track's sub-region in the combined regular image, that is, the upper-left position of the current sub-region in (c) of FIG. 5;
  • track_width, track_height: describe the width and height of the current track's sub-region in the combined regular image, that is, the width and height of the current sub-region in (c) of FIG. 5;
  • composition_width, composition_height: describe the width and height of the combined regular image, that is, the image width and height in (c) of FIG. 5;
  • proj_tile_x, proj_tile_y: describe the position of the upper-left pixel of the current track's sub-region in the source image, that is, the upper-left position of the current sub-region in (a) of FIG. 5;
  • proj_tile_width, proj_tile_height: describe the width and height of the current track's sub-region in the source image, that is, the width and height of the current sub-region in (a) of FIG. 5;
  • proj_width, proj_height: describe the width and height of the source image, that is, the image width and height in (a) of FIG. 5.
  • Manner 3: the location information of all sub-regions is saved in the MPD, that is, the division manner of the sub-regions is described in the MPD.
  • In the MPD syntax, the position of the image of each sub-region's code stream in the source video image may be described by means of 2D coordinates, or it may be described on the spherical surface. In the spherical description, the value "0" indicates the source identifier, and entries with the same source identifier value belong to the same source; "0, 30, 0" indicates the coordinates (yaw angle, pitch angle, rotation angle) of the center point of the spherical region corresponding to the sub-region; and "120, 30" indicates the width and height of the sub-region.
  • Manner 4: the location information of all sub-regions is saved in a private file, and the address of the private file is saved in the MPD; that is, the address of the private file storing the sub-region description information is written into the MPD by means of a specified file link.
  • For example, the division information of the sub-regions is saved as a private file tile_info.dat, and the address of that file is written into the MPD.
  • The format in which the sub-region division information is saved in the file may be specified by the user and is not limited here. The saved content may be in one of the following ways (a serialization sketch follows the field list below):
  • Tile_num indicates the number of divided sub-areas.
  • Pic_width indicates the width of the source image, that is, the width of the image in (a) in Fig. 5.
  • Pic_height indicates the height of the source image, that is, the height of the image in (a) in Fig. 5.
  • Comp_width indicates the width of the regular image after the sub-region combination, that is, the width of the image in (c) in FIG.
  • Comp_height indicates the height of the regular image after the sub-region combination, that is, the height of the image in (c) in FIG. 5.
  • Tile_pic_width[] An array representing the width of each sub-area in the source image. The number of elements should be the tile_num value.
  • Tile_pic_height[] An array indicating the height of each sub-area in the source image. The number of elements should be the tile_num value.
  • Tile_comp_width[] An array indicating the width of each sub-region in the regular image after the sub-region is combined. The number of elements should be the tile_num value.
  • Tile_comp_height[] An array indicating the height of each sub-region in the regular image after the sub-region is combined. The number of elements should be the tile_num value.
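  • As one possible serialization of the fields above, the following Python sketch writes a tile_info.dat-style private file; the little-endian 32-bit integer layout is an assumption, since the text names the fields but not their byte encoding:

    import struct

    def write_tile_info(path, pic_width, pic_height, comp_width, comp_height, tiles):
        # `tiles` holds one tuple per sub-region:
        # (tile_pic_width, tile_pic_height, tile_comp_width, tile_comp_height)
        with open(path, 'wb') as f:
            f.write(struct.pack('<5I', len(tiles), pic_width, pic_height,
                                comp_width, comp_height))
            for field in range(4):  # the four arrays, each tile_num long
                for t in tiles:
                    f.write(struct.pack('<I', t[field]))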
  • the Uniform Resource Locator (URL) of the private file is written into the MPD by specifying a new EssentialProperty attribute Tile@value.
  • the Tile@value attribute description can be as shown in Table 1.
  • Table 1 Tile@value attribute description: Tile@value — information specifying the tiles (sub-regions).
  • Manner 5: the location information of each sub-region is saved in the supplemental enhancement information (SEI) of that sub-region's code stream; that is, the division manner of the sub-regions is conveyed by writing the location information of each sub-region into the SEI of its code stream.
  • One possible setting of the SEI syntax elements for the sub-region division information is shown in Table 2.
  • Table 2 SEI syntax elements based on sub-area partitioning information
  • Src_pic_width represents the source image width, which is the image width in Figure 5(a).
  • Src_pic_height indicates the source image height, that is, the image height in FIG. 5(a).
  • Src_tile_x represents the horizontal coordinate of the upper left corner of the current sub-area on the source image, that is, the abscissa of the current sub-area in (a) of FIG.
  • Src_tile_y represents the longitudinal coordinate of the upper left corner of the current sub-region on the source image, that is, the ordinate of the current sub-region in (a) of FIG.
  • Src_tile_width indicates the width of the current subregion on the source image.
  • Src_tile_height indicates the height of the current sub-area on the source image.
  • the packed_pic_width indicates the width of the regular image after the sub-region is combined, that is, the image width in Fig. 5(c).
  • the packed_pic_height indicates the height of the regular image after the sub-region combination, that is, the image height in Fig. 5(c).
  • the packed_tile_x indicates the horizontal coordinate of the upper left corner of the current sub-area on the combined regular image, that is, the abscissa of the current sub-area in FIG. 5(c).
  • the packed_tile_y represents the longitudinal coordinate of the upper left corner of the current sub-area on the combined regular image, that is, the ordinate of the current sub-area in FIG. 5(c).
  • Packed_tile_width indicates the width of the current sub-area on the combined regular image.
  • Packed_tile_height indicates the height of the current sub-area on the combined regular image.
  • The present application can also extend the above manner 4: in the MPD, the URL of the private file storing the sub-region location information may instead be specified by a new element.
  • In this extension of manner 4, the address of the private file storing the sub-region division information is still written into the MPD by means of a specified file link: the location information of the sub-regions is saved as the private file tile_info.dat, a syntax element <UserdataList> (see Table 3) containing a UserdataURL element is added, and the address of the private file is written into the MPD.
  • The terminal obtains the private file by parsing <UserdataList>, thereby obtaining information such as the division manner and locations of the sub-regions.
  • The description information of the sub-region division manner in the foregoing manner 4 may also be extended by adding, to the content of the transmitted private file tile_info.dat, a relationship table between user viewing angles and the required sub-regions, so that the terminal can obtain the required sub-regions faster.
  • The saved content may be in one of the following ways:
  • The data newly added to manner 4 are deg_step_latitude, deg_step_longitude, view_tile_num, and viewport_table[][], with the following meanings:
  • Deg_step_latitude The step size of the viewpoint area divided in the latitude direction, which divides the latitude range from -90° to 90° into multiple viewpoint areas.
  • A viewpoint area refers to the range of the area on the latitude and longitude map to which a certain viewpoint belongs; for all viewpoints within the same viewpoint area, the image sub-region code streams obtained by the terminal to cover the viewing angle are the same.
  • For example, when the entire latitude and longitude map is divided into nine viewpoint areas, both viewpoint 1 and viewpoint 2 belong to the fifth viewpoint area, whose center viewpoint is indicated in the figure. For all viewpoints within viewpoint area 5, the corresponding viewing-angle coverage is calculated as the range covered by the viewing angle of the center viewpoint.
  • Deg_step_longitude The step size of the viewpoint area divided in the longitude direction, which divides the longitude range from 0° to 360° into multiple viewpoint areas. Deg_step_latitude and deg_step_longitude together determine the number of viewpoint regions.
  • View_tile_num The maximum number of sub-areas that can be covered when a single perspective changes.
  • Viewport_table[][] An array for storing the relationship between a view area and the image sub-area number of the view area.
  • the total number of data in the table should be the number of view areas multiplied by view_tile_num.
  • The 18 numbers in each row of the data table represent the sub-region numbers covered by the viewing angle of a certain viewpoint.
  • When a viewing angle is covered by fewer than 18 sub-regions, the vacant entries are filled with the value 0.
  • For example, as shown in FIG. 9, for the viewing angle of the viewpoint at latitude 0° and longitude 150°, the corresponding sub-regions are those numbered 5, 6, 7, 13, 14, 15, 16, 25, 26, 27, 28, 35, 36, and 37, so the values in the data table are represented as 5, 6, 7, 13, 14, 15, 16, 25, 26, 27, 28, 35, 36, 37, 0, 0, 0, 0.
  • After obtaining these values, the terminal only needs to look up the sub-region numbers in the table row corresponding to the current viewpoint, and can directly request and decode the sub-region code streams corresponding to those numbers without performing the coverage calculation, thereby speeding up the processing of the terminal.
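  • The lookup can be sketched in Python as follows; the row-major indexing of the viewpoint areas is an assumption, since the text does not fix the numbering convention:

    def tiles_for_viewpoint(lat, lon, viewport_table, step_lat, step_lon, view_tile_num):
        # viewport_table is flattened, one row of view_tile_num entries per
        # viewpoint area; 0 marks an unused (vacant) slot.
        areas_per_row = 360 // step_lon
        row = int((lat + 90) // step_lat)
        col = int((lon % 360) // step_lon)
        area = row * areas_per_row + col
        start = area * view_tile_num
        return [n for n in viewport_table[start:start + view_tile_num] if n]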
  • In other words, the above private file includes the correspondence between the numbers of the user viewpoint areas and the numbers of the sub-regions covered by the corresponding viewing angles.
  • The embodiment of the present application may further add, to the private file tile_info.dat, marker data for optimized presentation of the viewing angle. Correspondingly, the data in the table viewport_table[][] may be arranged in an optimized form: the closer a sub-region is to the current viewpoint, the earlier its number appears in the row corresponding to that viewpoint.
  • In this case the private file further includes information indicating the number of sub-regions to be preferentially displayed among the sub-regions covered by the user's viewing angle, the numbers of the sub-regions to be preferentially displayed, the numbers of the sub-regions displayed with secondary priority, and the numbers of the sub-regions that are not displayed.
  • the saved content can be in one of the following ways:
  • The newly added data is priority_view_tile_num, whose meaning is the number of sub-regions to be preferentially displayed under the current viewpoint.
  • The arrangement of the data in the viewport_table[][] table is modified so that the sub-regions near the current viewpoint are placed at the front of the row corresponding to that viewpoint, as follows:
  • For the viewing angle of the viewpoint shown in FIG. 9 at latitude 0° and longitude 150°, the data in the table is changed to 14, 15, 26, 27, 13, 6, 16, 28, 36, 25, 5, 7, 35, 37, 0, 0, 0, 0: the sub-region numbers 14, 15, 26, and 27 closest to the viewpoint are placed first, the farther sub-region numbers 13, 6, 16, 28, 36, and 25 are placed in the middle, and the farthest sub-region numbers 5, 7, 35, and 37 are placed last.
  • Sub-regions close to the viewpoint are displayed preferentially, sub-regions farther from the viewpoint are displayed with secondary priority, and the farthest sub-regions may not be displayed.
  • After the server encapsulates the code streams, the terminal may obtain them for decoding and display. Therefore, as shown in FIG. 10, the method in this embodiment of the present application may further include:
  • Step 101: The terminal determines the location information of each sub-region of the panoramic image.
  • In one manner, the terminal may receive first information sent by the server, where the first information includes the tracks of the sub-regions of the panoramic image and the code streams of the sub-regions, and a track includes the location information of all sub-regions of the panoramic image.
  • The terminal obtains the location information of each sub-region in the panoramic image by parsing the track.
  • The track may be the track of the spliced image in the above manner 1, and the terminal may obtain the location information of all sub-regions by parsing the syntax defined in RectRegionPacking(i) in the track of the spliced image.
  • Alternatively, when the location information of each sub-region is stored in the manner described above in the track corresponding to that sub-region, that is, the tile track, the terminal may obtain the location information of the current sub-region by parsing the fields defined in the SubPictureCompositionBox in each tile track.
  • In another manner, the terminal may receive the MPD sent by the server, where the MPD includes the location information of each sub-region or includes the address of a private file containing the location information of each sub-region; the terminal parses the MPD to obtain the location information of each sub-region.
  • Alternatively, when the location information of each sub-region exists in the SEI corresponding to that sub-region, the terminal may first obtain the code stream corresponding to each required sub-region; that is, when the terminal requests the code stream of a required sub-region, it can acquire the location information of the sub-region from the SEI in the code stream.
  • Step 102: The terminal determines, according to the determined location information of each sub-region, the location information of the sub-regions covered by the current viewing angle in the panoramic image.
  • For example, the terminal may acquire the location information of the sub-regions covered by the current viewing angle in the panoramic image according to the matching relationship between the viewing angle and the location information of the sub-regions it covers.
  • Step 103: The terminal determines the first sampling interval of each sub-region.
  • The terminal may take a preset sampling interval as the first sampling interval; or the terminal may receive the first sampling interval from the server; or the terminal may obtain the first sampling interval according to the location information of each sub-region received from the server, that is, a preset calculation rule relates the location information of each sub-region to its first sampling interval.
  • For example, the calculation rule may be that the ratio of the size in the sub-region's position information in the source image to the size in its position information in the spliced image is the first sampling interval.
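  • As a sketch of this calculation rule in Python (the function name is illustrative):

    def first_sampling_interval(src_width, packed_width):
        # Ratio of the sub-region's width in the source latitude and
        # longitude map to its width in the spliced image; for example
        # 1280 / 320 = 4, i.e. a scaling factor of 1/4.
        return src_width / packed_width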
  • Step 104: The terminal acquires the code streams corresponding to the sub-regions covered by the current viewing angle according to the determined location information of those sub-regions.
  • In one manner, the terminal may directly obtain the code streams of the sub-regions covered by the current viewing angle from the memory of the terminal.
  • In another manner, the terminal requests the code streams corresponding to the sub-regions covered by the current viewing angle from the server.
  • For example, the terminal may send information indicating the current viewing angle to the server; the server may determine, according to the current viewing angle and the location information of the sub-regions, the sub-regions covered by the current viewing angle, and then send the code streams corresponding to those sub-regions to the terminal. For instance, the server may splice the sub-region code streams to be transmitted into one code stream and send it to the terminal.
  • Alternatively, the terminal may itself determine the sub-regions covered by the current viewing angle according to the current viewing angle and the location information of the sub-regions, send the numbers of those sub-regions to the server, and the server then sends the code streams of the numbered sub-regions to the terminal.
  • Alternatively, the terminal may obtain the code streams corresponding to the sub-regions covered by the current viewing angle from the server according to a protocol preset between the terminal and the server, where the protocol includes the correspondence between viewing angles and the sub-regions they cover. This application does not limit the manner in which the terminal obtains the required code streams.
  • Step 105: The terminal decodes the code streams to obtain the images of the sub-regions covered by the current viewing angle.
  • Since the server performs horizontal and vertical division of the latitude and longitude map and downsampling of the sub-regions, the pixels of the sub-regions are de-redundancy processed, so the pixel redundancy of the transmitted sub-regions and the number of transmitted pixels are reduced. For the decoding terminal, the decoding capability required to decode the code streams of the sub-regions covered by the current viewing angle is therefore lower and the decoding complexity is reduced, so the decoding speed is improved.
  • Step 106: The terminal resamples the decoded images according to the determined location information of the sub-regions covered by the current viewing angle and the first sampling interval.
  • Step 107: The terminal plays the resampled images.
  • For example, the terminal can obtain, based on the correspondence between sub-region numbers and code streams, the code streams of the required sub-regions, including the sub-streams numbered 1, 3, 4, 5, 6, 15, 19, 20, 21, 22, 23, 24, 34, 35, 36, and 37, as shown in (c) of FIG. 11. Further, after the terminal decodes these sub-streams, the decoded images may be resampled according to the location information and the first sampling interval, and the resampled image is then played, as shown in (d) of FIG. 11.
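  • The resampling of step 106 can be sketched as nearest-neighbour upsampling in Python (assuming the sub-region was only downsampled horizontally, as in the example of FIG. 5):

    import numpy as np

    def resample_for_display(decoded, first_sampling_interval):
        # Repeat each column `first_sampling_interval` times to restore
        # the sub-region to its width in the source latitude and longitude
        # map, e.g. interval 4 for the 1/4-scaled high-latitude sub-regions.
        return np.repeat(decoded, int(first_sampling_interval), axis=1)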
  • For 3D panoramic video, the photographing devices communicating with the server may include two groups: one group for acquiring the panoramic video of the left eye and another group for acquiring the panoramic video of the right eye.
  • The division of the sub-regions of the 3D latitude and longitude map can be as shown in FIG. 12.
  • the latitude and longitude diagram of the left eye is the upper half of Fig. 12, and the latitude and longitude diagram of the right eye is the lower half of Fig. 12.
  • The latitude and longitude map corresponding to the left eye and the latitude and longitude map corresponding to the right eye can be spliced together into one latitude and longitude map, or they can be kept separate as two latitude and longitude maps.
  • The server may divide the latitude and longitude map corresponding to the left eye and the latitude and longitude map corresponding to the right eye separately, that is, perform horizontal division and vertical division of the left-eye map and horizontal division and vertical division of the right-eye map.
  • For the horizontal division of the left-eye latitude and longitude map of the 3D latitude and longitude map, reference may be made to the implementation of step 401; the horizontal division of the right-eye latitude and longitude map may likewise follow the implementation of step 401, and details are not described herein again.
  • For the vertical division of the left-eye latitude and longitude map of the 3D latitude and longitude map, reference may be made to the implementation of step 402; the vertical division of the right-eye latitude and longitude map may likewise follow the implementation of step 402, and details are not described herein again.
  • For the sampling of the sub-regions of the left-eye latitude and longitude map of the 3D latitude and longitude map, reference may be made to the implementations of steps 404 to 405; the sampling of the sub-regions of the right-eye latitude and longitude map may likewise follow the implementations of steps 404 to 405, and details are not described herein again.
  • In one encoding manner, each sub-region is taken from the original image as a sub-image, and each sub-image sequence is independently encoded, generating 84 sub-code streams.
  • In another manner, the sub-regions at the same position in the left-eye latitude and longitude map and the right-eye latitude and longitude map are taken as a group, their images are spliced, and each spliced group is independently encoded, generating 42 sub-code streams.
  • The decoding process of the video content by the terminal differs from that described above for the 2D latitude and longitude map as follows: the location information of the sub-regions covered by the current viewing angle here includes the location information of both the left-eye image sub-regions and the right-eye image sub-regions.
  • The code streams of the sub-regions covered by the current viewing angle include the code streams of the sub-regions in the left-eye latitude and longitude map and the code streams of the sub-regions in the right-eye latitude and longitude map.
  • The value of the current viewing angle may be taken from the viewing angle of the left eye or of the right eye; no limitation is imposed here.
  • During resampling, the images of the sub-regions covered by the left eye of the current viewing angle and the images of the sub-regions covered by the right eye are resampled separately, and the required left-eye sub-regions and right-eye sub-regions are then rendered and displayed.
  • The above method can be applied not only to the latitude and longitude map of a 360-degree panoramic video, but also to a part of such a map; for example, the division may also be applied to the latitude and longitude map of a 180° half-panoramic video.
  • A 180° half-panoramic video is a panoramic video whose longitude range is 180°, containing half of the full panoramic content.
  • For the manner of horizontally dividing the latitude and longitude map of the 180° half-panoramic video, reference may be made to the above step 401.
  • The implementation of the above step 402 differs by latitude band: sub-images in the latitude ranges -90° to -60° and 60° to 90° can be left without vertical division, maintaining a single sub-region each; sub-images in the latitude ranges -60° to -30° and 30° to 60° are vertically divided at a longitude interval of 60°, obtaining 3 sub-regions each; and sub-images in the latitude ranges -30° to 0° and 0° to 30° are vertically divided at a longitude interval of 30°, obtaining 6 sub-regions each.
  • The sub-region division of the latitude and longitude map of the entire 180° half-panoramic video is thus completed, yielding 20 sub-regions in total.
  • The sub-regions of the latitude and longitude map of the 180° half-panoramic video may also be downsampled before encoding, in the same manner as in the above step 404.
  • The implementation of step 405 may differ here. As shown in (a) of FIG. 13, for sub-images in the latitude ranges -90° to -60° and 60° to 90°, the vertical direction is unchanged and horizontal downsampling is performed with a scaling factor of 1/6; for sub-images in the latitude ranges -60° to -30° and 30° to 60°, the vertical direction is likewise unchanged and horizontal downsampling is performed with a scaling factor of 1/2; sub-images in the latitude ranges -30° to 0° and 0° to 30° are not scaled.
  • The resulting scaled image can be as shown in (b) of FIG. 13.
  • The above sub-region division method for the latitude and longitude map of the 180° half-panoramic video can also be applied to the latitude and longitude map of a 3D 180° half-panoramic video. Similarly to the 360° panoramic case, the latitude and longitude map of the 3D 180° half-panoramic video also consists of the 180° half-panoramic latitude and longitude map of the left eye and that of the right eye.
  • The latitude and longitude map of the left eye and that of the right eye can be spliced together: as shown in FIG. 14, the left-eye map is the left half and the right-eye map is the right half.
  • The server may first separate the latitude and longitude map of the left eye from that of the right eye, as shown by the dashed line in FIG. 14. Then the left-eye map is divided according to the division of the latitude and longitude map of the 180° half-panoramic video, and the right-eye map is divided in the same way, finally obtaining 20 sub-regions corresponding to the left-eye map and 20 sub-regions corresponding to the right-eye map, 40 sub-regions in total.
  • Besides obtaining the latitude and longitude map corresponding to the panoramic or half-panoramic video from the video signal captured by the photographing device, the server in the embodiment of the present application may also divide the spherical panoramic signal directly to obtain the image sub-regions. Since the source image is then a spherical signal map, or spherical map, the sub-region description in the code stream encapsulation changes accordingly.
  • The spherical surface specifies signal positions by latitude and longitude: the agreed longitude range is 0 to 360°, and the latitude range is -90° to 90° (negative degrees indicate south latitude, and positive degrees indicate north latitude).
  • For this case, the embodiment of the present application provides a method for processing an image, as shown in FIG. 15.
  • Step 1501: The server horizontally divides the spherical image of the image to be processed, where the division positions of the horizontal division are preset latitudes.
  • For example, the server may horizontally slice the sphere along the latitude circles at -60°, -30°, 0°, 30°, and 60° on the spherical surface, as shown in (a) of FIG. 16.
  • Step 1502: The server vertically divides the spherical image of the image to be processed.
  • The division positions of the vertical division are determined by latitude; at least two different vertical division intervals exist among the areas formed by adjacent horizontal division positions, where a vertical division interval is the distance between adjacent vertical division positions. The division yields the sub-regions of the spherical image.
  • For example, with a vertical division interval of 120° of longitude, a spherical zone is vertically divided along the meridians to obtain 3 spherical sub-regions.
  • Step 1503: The server samples the image of each sub-region.
  • The server may first map the image of each sub-region to a two-dimensional planar image according to a preset size, so that the sub-regions of the resulting latitude and longitude map can then be sampled according to the first sampling interval and the second sampling interval.
  • The mapping from the three-dimensional spherical map to the two-dimensional latitude and longitude map may be as follows: the image of each sub-region obtained by dividing the spherical image is uniformly sampled in the vertical direction according to the preset height and uniformly sampled in the horizontal direction according to the preset width. Then the image of each uniformly sampled sub-region can be sampled in the horizontal direction according to the first sampling interval and in the vertical direction according to the second sampling interval.
  • That is, image signal mapping is performed on all sub-regions of the spherical surface in (a) of FIG. 16, so that each sub-region on the spherical map corresponds to a sub-region of the mapped image, that is, a two-dimensional latitude and longitude map, and the mapped image is then downsampled.
  • The manner of mapping the spherical signal to the sub-region images is not limited here.
  • One way may be: in the latitude direction, for each spherical sub-region, the spherical signal is uniformly mapped according to the preset height of the sub-region image in (b) of FIG. 16, where uniform mapping can be understood as uniform sampling.
  • In the longitude direction, for the spherical sub-regions in the latitude ranges -90° to -60° and 60° to 90°, the spherical signal is downsampled at 1/4 of the sampling rate of the latitude direction, that is, the scaling factor is 1/4; for the spherical sub-regions in the latitude ranges -60° to -30° and 30° to 60°, the spherical signal is downsampled at 1/2 of the sampling rate of the latitude direction, that is, the scaling factor is 1/2; and for the spherical sub-regions in the latitude ranges -30° to 0° and 0° to 30°, the spherical signal is mapped at the same sampling rate as the latitude direction, that is, the scaling factor is 1.
  • The finally obtained sampled latitude and longitude image is as shown in (b) of FIG. 16.
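  • A minimal Python sketch of this spherical-to-planar mapping follows; the sampling callback sample_fn(lat, lon) is an assumed interface standing in for the spherical signal:

    import numpy as np

    def map_sphere_subregion(sample_fn, lat_lo, lat_hi, lon_lo, lon_hi,
                             height, base_width):
        # Latitude-dependent horizontal scaling factor (1/4, 1/2 or 1),
        # matching the longitude-direction downsampling described above.
        top = max(abs(lat_lo), abs(lat_hi))
        factor = 0.25 if top > 60 else 0.5 if top > 30 else 1.0
        width = max(1, int(base_width * factor))
        lats = np.linspace(lat_lo, lat_hi, height)   # uniform in latitude
        lons = np.linspace(lon_lo, lon_hi, width)    # downsampled in longitude
        return np.array([[sample_fn(la, lo) for lo in lons] for la in lats])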
  • Step 1504: The server adjusts the positions of the sampled sub-regions so that the horizontal edges of the sub-region images are aligned and the vertical edges are aligned, for example as shown in (c) of FIG. 16. Step 1504 may also be skipped.
  • Step 1505: The server encodes the spliced image in tile mode.
  • For the implementation of step 1505, refer to the foregoing step 407; details are not described herein again.
  • The encapsulation manner of the code streams of the sub-regions may be the same as in the above step 408, and the manner of saving the location information of the sub-regions may likewise be the same. The difference is that, when the sub-regions are obtained by horizontally and vertically dividing the spherical image of the image to be processed, the sampled sub-regions form a sampled spherical image, and the location information includes the position and latitude-longitude range of the sub-region in the image of the spherical map and the position and size of the sub-region in the image of the sampled spherical map; or, the location information includes the position and latitude-longitude range of the sub-region in the spherical map and the position and size of the sub-region in the spliced image.
  • For this sub-region division method, the semantics of the variables above are modified as follows:
  • proj_reg_width[i], proj_reg_height[i]: describe the latitude-longitude range of the i-th sub-region in the source image, that is, the spherical map; in other words, the range in (a) of FIG. 16 corresponding to a sub-region in (b) of FIG. 16. For example, the first sub-region in the upper left corner of (b) of FIG. 16 has a latitude-longitude range of (120°, 30°) in the source image.
  • proj_reg_top[i], proj_reg_left[i]: describe the position, expressed in latitude and longitude, of the upper-left pixel of the i-th sub-region in the spherical map, that is, the position in (a) of FIG. 16 corresponding to the upper-left point of the sub-region in (b) of FIG. 16; for example, the position of the first sub-region in the spherical map is (0°, 90°).
  • Proj_tile_width, proj_tile_height describes the latitude and longitude range of the sub-area of the current track in the spherical map, that is, the latitude and longitude range of the current sub-area in (a) of FIG. 16;
  • Proj_width, proj_height describes the latitude and longitude range of the spherical image, such as 360° panoramic spherical latitude and longitude range (360°, 180°).
  • Pic_width indicates the range of the longitude of the spherical map.
  • Pic_height indicates the latitude range of the spherical map.
  • Tile_pic_width[] An array representing the longitude range of each sub-area in the spherical map. The number of elements should be the tile_num value.
  • Tile_pic_height[] An array representing the latitude range of each sub-region in the spherical map. The number of elements should be the tile_num value.
  • Src_pic_width represents the range of the longitude of the spherical map, that is, the range of the longitude of the spherical map in (a) of Fig. 16.
  • Src_pic_height represents the latitude range of the spherical image, that is, the latitude range of the spherical image in (a) of Fig. 16.
  • Src_tile_width represents the longitude range of the current sub-region on the spherical map.
  • Src_tile_height represents the latitude range of the current sub-region on the spherical map.
  • Compared with uniform division of the latitude and longitude map, the above manner reduces image redundancy during encoding, which can greatly improve the transmission efficiency of tile-wise coding.
  • The maximum decoding capability required of the terminal decoder is also reduced, which makes it possible to encode, transmit, and present higher-resolution source images under existing decoding capabilities. Taking a uniform 6x3 division as an example, the proportion of pixels that needs to be transmitted is up to 55.6%; if the resolution of the source image is 4K (4096x2048), the decoder capability needs to reach about 4Kx1K. With the method described in the present application, the transmitted pixel proportion is at most 25%, and the required decoding capability of the decoder is 2Kx1K.
  • Moreover, this improves the speed of decoding and playback; the decoding scheme of the present application is more efficient than the uniform division scheme.
  • the embodiment of the present application further provides an image processing method, which is applied to a server, as shown in FIG. 17A, and includes:
  • The server saves the code streams corresponding to the images of the sub-regions of the latitude and longitude map or the spherical map of a panoramic image, where the sub-regions are obtained by horizontally and vertically dividing the latitude and longitude map or the spherical map, the division positions of the horizontal division are preset latitudes, the division positions of the vertical division are determined by latitude, at least two different vertical division intervals exist among the areas formed by adjacent horizontal division positions, and a vertical division interval is the distance between adjacent vertical division positions.
  • The server sends, to the terminal, among the saved code streams of the sub-regions, the code streams of the sub-regions covered by the current viewing angle requested by the terminal.
  • The image corresponding to each sub-region stored in the server has been sampled in the horizontal direction according to the first sampling interval before being encoded, where the higher the latitude corresponding to the sub-region, the larger the first sampling interval; or it has been sampled in the vertical direction according to the second sampling interval.
  • The server in this embodiment saves the code streams corresponding to the images of the sub-regions produced by the image processing of the foregoing embodiments. Because of the sub-region division manner and sampling process adopted there, transmitting these code streams occupies less bandwidth, and the decoding capability required at the decoding end, the decoding complexity, and the decoding time are all reduced.
  • Compared with the prior art, the bandwidth occupied by the server in this embodiment is therefore reduced, and the decoding speed of the terminal is improved.
  • The foregoing describes the solutions of the embodiments of the present application mainly from the perspective of interaction between network elements such as the server and the terminal. To implement the above functions, each network element includes hardware structures and/or software modules corresponding to each function.
  • A person skilled in the art should readily appreciate that, in combination with the examples and algorithm steps described in the embodiments disclosed herein, the present application can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
  • The embodiments of the present application may divide the server, the terminal, and the like into functional modules according to the foregoing method examples; for example, each functional module may be divided according to a corresponding function, or two or more functions may be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of the module in the embodiment of the present application is schematic, and is only a logical function division, and the actual implementation may have another division manner.
  • FIG. 17 is a schematic diagram showing a possible structure of the server involved in the foregoing embodiments, where the server 17 includes: a dividing unit 1701, an encoding unit 1702, a sampling unit 1703, a splicing unit 1704, an encapsulation unit 1705, and a transmission unit 1706.
  • The dividing unit 1701 can be used to support the server in executing the processes 401 and 402 in FIG. 4, and the encoding unit 1702 can be used to support the server in executing the process 403 in FIG. 4 and the process 407 in FIG. 7.
  • The sampling unit 1703 can be used to support the server in executing the processes 404 and 405 in FIG. 7, and the splicing unit 1704 is used to support the server in executing the process 406 in FIG. 7.
  • The encapsulation unit 1705 can be used to support the server in executing the process 408 in FIG. 7. All the related content of the steps involved in the foregoing method embodiments may be referred to in the functional descriptions of the corresponding functional modules, and details are not described herein again.
  • FIG. 18 shows a possible structural diagram of the server involved in the above embodiment.
  • the server 18 includes a processing module 1802 and a communication module 1803.
  • the processing module 1802 is configured to control and manage the actions of the server. For example, the processing module 1802 is configured to support the server to execute the processes 401, 402, 403, 404, 405, 406, 407, and 408 in FIG. 4 and FIG. 7, and/or other processes for the techniques described herein.
  • the communication module 1803 is for supporting communication between the server and other network entities, such as communication with the terminal.
  • the server may also include a storage module 1801 for storing program code and data of the server.
  • the processing module 1802 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application.
  • the processor may also be a combination implementing a computing function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the communication module 1803 may be a transceiver, a transceiver circuit, a communication interface, or the like.
  • the storage module 1801 may be a memory.
  • when the processing module 1802 is a processor, the communication module 1803 is a transceiver, and the storage module 1801 is a memory, the server in this embodiment of the present application may be the server shown in FIG. 19.
  • the server 19 includes a processor 1912, a transceiver 1913, a memory 1911, and a bus 1914.
  • the transceiver 1913, the processor 1912, and the memory 1911 are connected to each other through a bus 1914.
  • the bus 1914 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 19, but this does not mean that there is only one bus or only one type of bus.
  • FIG. 20 is a schematic diagram showing a possible structure of the terminal involved in the foregoing embodiment.
  • the terminal 20 includes: an obtaining unit 2001, a decoding unit 2002, a resampling unit 2003, and a playing unit 2004.
  • the obtaining unit 2001 is configured to support the terminal to execute the processes 101, 102, 103, and 104 in FIG. 10.
  • the decoding unit 2002 is configured to support the terminal to execute the process 105 in FIG. 10, the resampling unit 2003 is configured to support the terminal to execute the process 106 in FIG. 10, and the playing unit 2004 is configured to support the terminal to execute the process 107 in FIG. 10. All related content of the steps in the foregoing method embodiments may be cited in the function descriptions of the corresponding function modules, and details are not described herein again.
  • FIG. 21 shows a possible structural diagram of the terminal involved in the above embodiment.
  • the terminal 21 includes a processing module 2102 and a communication module 2103.
  • the processing module 2102 is configured to control and manage the actions of the terminal. For example, the processing module 2102 is configured to support the terminal to execute the processes 101 to 106 in FIG. 10, and/or other processes for the techniques described herein.
  • the communication module 2103 is for supporting communication between the terminal and other network entities, such as communication with a server.
  • the terminal may further include a storage module 2101, configured to store program code and data of the terminal, and a display module 2104, configured to support the terminal to execute the process 107 in FIG. 10.
  • the processing module 2102 can be a processor or a controller, such as a CPU, a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It is possible to implement or carry out the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination of computing functions, for example, including one or more microprocessor combinations, a combination of a DSP and a microprocessor, and the like.
  • the communication module 2103 can be a transceiver, a transceiver circuit, a communication interface, or the like.
  • the storage module 2101 can be a memory.
  • Display module 2104 can be a display or the like.
  • when the processing module 2102 is a processor, the communication module 2103 is a transceiver, the storage module 2101 is a memory, and the display module 2104 is a display, the terminal in this embodiment of the present application may be the terminal shown in FIG. 22.
  • the terminal 22 includes a processor 2212, a transceiver 2213, a memory 2211, a display 2215, and a bus 2214.
  • the transceiver 2213, the processor 2212, the display 2215, and the memory 2211 are connected to each other through a bus 2214.
  • the bus 2214 may be a PCI bus or an EISA bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 22, but it does not mean that there is only one bus or one type of bus.
  • FIG. 23 is a schematic diagram showing a possible structure of a server involved in the foregoing embodiment.
  • the server 23 includes a storage unit 2301 and a transmission unit 2302, where the storage unit 2301 is configured to support the server to execute the process 17A1 in FIG. 17A, and the transmission unit 2302 is configured to support the server to execute the process 17A2 in FIG. 17A. All related content of the steps in the foregoing method embodiments may be cited in the function descriptions of the corresponding function modules, and details are not described herein again.
  • FIG. 24 shows a possible structural diagram of the server involved in the above embodiment.
  • the server 24 includes a storage module 2402 and a communication module 2403.
  • the storage module 2402 is configured to store program code and data of the server; for example, the program is used to execute the process 17A1 in FIG. 17A. The communication module 2403 is configured to execute the process 17A2 in FIG. 17A.
  • the server involved in the embodiment of the present application may be the server shown in FIG. 25.
  • the server 25 includes a transceiver 2511, a memory 2512, and a bus 2513.
  • the transceiver 2511 and the memory 2512 are connected to each other through a bus 2513.
  • the bus 2513 may be a PCI bus or an EISA bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 25, but it does not mean that there is only one bus or one type of bus.
  • the steps of the methods or algorithms described in connection with the disclosure of this application may be implemented by hardware, or may be implemented by a processor executing software instructions.
  • the software instructions may consist of corresponding software modules, and the software modules may be stored in a random access memory (RAM), a flash memory, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in a core network interface device.
  • the processor and the storage medium may also exist as discrete components in the core network interface device.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Social Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Instructional Devices (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of this application provide an image processing method, a terminal, and a server, relating to the field of media standards and media application technologies, and capable of solving the problem of low coding efficiency and wasted bandwidth in encoding and transmission that is caused by uniformly dividing an image according to a latitude-longitude map during image sampling. The method is: performing horizontal division and longitudinal division on a latitude-longitude map or a sphere map of a to-be-processed image to obtain sub-areas of the latitude-longitude map or the sphere map, where a division position of the horizontal division is a preset latitude, a division position of the longitudinal division is determined by latitude, at least two types of longitudinal division intervals exist in a region formed by adjacent division positions of the horizontal division, and a longitudinal division interval is the distance between adjacent division positions of the longitudinal division; and encoding the images of the obtained sub-areas. The embodiments of this application are used for tile-wise encoding and transmission of images.

Description

一种图像的处理方法、终端和服务器
本申请要求于2017年07月31日提交中国专利局、申请号为201710645108.X、申请名称为“一种图像的处理方法、终端和服务器”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及媒体标准和媒体应用技术领域,尤其涉及一种图像的处理方法、终端和服务器。
背景技术
在视频应用中,虚拟现实(virtual reality,VR)/360度全景视频正在兴起,给人们带来了新的观看方式和视觉体验,同时也带来了新的技术挑战。360度全景视频由多个摄像机对物体进行多角度拍摄,支持多角度播放。其图像信号可以虚拟为一种球面信号,如图1所示,球面中不同位置的球面图像信号可以表示不同的视角内容。然而,虚拟的球面图像信号无法直观被人眼所见,因此需要将三维的球面图像信号表示为二维平面图像信号,例如通过经纬图、立方体等表述形式表示。这些表示形式实际上是将球面图像信号通过某种映射方式映射到二维的图像上,使其变为人眼所能直观看到的图像信号,最常用的直观图像格式为经纬图。该图像的采集方式是,水平方向上根据经度角对球面图像信号均匀采样,垂直方向上根据纬度角进行均匀采样,以地球的球面图像信号为例,其获得的二维映射图像如图2所示。
在VR应用中,球面图像信号为360度全景图像,而人眼的视角范围通常约为120度,因此人眼视角下看到的有效球面信号约为全景信号的22%。VR终端设备(如VR眼镜)可支持的单视角大约在90度至110度之间,能获得较好的用户观看体验。然而,由于用户观看图像时单个视角内的图像内容信息占整幅全景图像的小部分,视角外的图像信息对用户来说并没做任何使用,若将所有全景图像进行传输,会造成不必要的带宽浪费。因此,在全景视频基于视角的视频编码(viewport dependent video coding,VDC)的编码传输技术中,将整个视频中的图像进行划分,并根据用户的当前视角选择需要进行传输的图像子区域,从而达到节省带宽的目的。
上述全景视频VR编码传输技术可以包括2种：1）单独使用Tile-wise编码传输方式；2）全景图像编码与Tile-wise编码传输方式混合编码传输。其中，Tile-wise编码传输方式，是指将图像序列划分为一些图像子区域，分别将所有子区域进行独立编码，生成单个或多个码流。其中，在针对经纬图进行均匀划分方式中，包括将经纬图在宽高方向均匀划分为多个Tile，在客户端，当用户观看某一视角的图像时，客户端根据用户视角位置计算该视角在图像上所覆盖的范围，并根据该范围获得图像所需传输的Tile信息，包括Tile在图像中的位置和尺寸等，从服务端请求这些Tile对应的码流进行传输，从而在客户端对当前视角进行渲染和显示。但是，采用经纬图进行划分时，在赤道附近图像的采样率较高，在两极部分图像的采样率较低，即赤道附近的图像像素冗余度较低，两极部分的图像像素冗余度较高，而且越往高纬度其冗余度越大，如果采用经纬图进行均匀划分，并未考虑到经纬图在不同纬度下的像素冗余度问题，对每个图像块以相同的条件在相同分辨率下进行编码传输，编码效率低，也会对传输带宽的浪费较大。
发明内容
本申请实施例提供一种图像的处理方法、终端和服务器,能够解决图像采样时,采用经纬图均匀划分图像造成编码传输时编码效率低以及带宽浪费的问题。
第一方面,提供一种图像的处理方法,应用于服务器,包括:对待处理图像的经纬图或球面图进行横向划分和纵向划分,以得到经纬图或球面图的各个子区域,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离;对得到的各个子区域的图像进行编码。这样,相对于现有技术对经纬图按照相同的划分间隔进行均匀划分的方式中,均匀划分时划分的细致的特性导致的编码效率低,编码后传输时占用的带宽大的问题,本申请这种不同纬度之间按照至少两种纵向划分间隔进行纵向划分的方式,避免如现有技术中均匀划分细致的特征,本申请可以按照多种纵向划分间隔纵向划分,能够使得图像的子区域有多种大小,划分间隔越大,子区域越大,编码时的编码效率得到提升,编码后服务器向终端传输码流时的占用的带宽减小。
在一种可能的设计中,纵向划分的划分位置由纬度确定包括:纵向划分的划分位置所处的纬度越高,纵向划分间隔越大。这样由于子区域所在的纬度的不同,纬度高的子区域越大,划分越粗略,可以使得编码传输效率得到提升,传输带宽减小。
在一种可能的设计中,在对得到的各个子区域的图像进行编码之前,该方法还包括:对子区域的图像按照第一采样间隔在横向进行采样;其中,子区域所对应的纬度越高,第一采样间隔越大;对得到的各个子区域的图像进行编码,包括:对采样后的各个子区域的图像进行编码。由于经纬图赤道附近的图像的像素冗余度低,两极部分的图像的像素冗余度高,如果每个子区域在相同分辨率下进行编码传输,对传输带宽浪费大,另外,解码端像素冗余高,会使得解码端对于解码能力要求高,解码速度低。而本申请在可以在编码前进行横向采样,且横向采样时,子区域对应的纬度越高,第一采样间隔越大,即对高纬度的子区域在横向进行下采样,也即压缩采样,能够使得高纬度的子区域在编码前传输的图像像素冗余度降低,达到减小带宽的目的,同时下采样使得需要编码传输的像素值减少,使得解码端对于解码能力的要求降低,解码复杂度下降,从而使得解码速度得到提升。
在一种可能的设计中,在对得到的各个子区域的图像进行编码之前,该方法还包括:对子区域的图像按照第二采样间隔在纵向进行采样。其中,第二采样间隔可以是与采样前子区域的间隔相同,即在纵向保持原采样,也可以比采样前子区域的间隔小,即在纵向整体进行下采样,同理,也可以使得编码传输的带宽较小,解码端解码时的复杂度降低,解码速度得到提升。
在一种可能的设计中,当子区域由对待处理图像的球面图进行横向划分和纵向划分得到时,在对子区域的图像按照第一采样间隔在横向进行采样之前,该方法还包括: 将子区域的图像按照预设尺寸映射为二维平面图像;对子区域的图像按照第一采样间隔在横向进行采样,包括:对子区域的图像映射的二维平面图像按照第一采样间隔在横向进行采样。也就是说,如果服务器从拍摄设备采集到的是球面图时,可以先将球面图的子区域图像映射到二维的经纬图,而后对经纬图进行下采样,这样假设服务器从拍摄设备直接采集到球面信号,就可以直接对球面图划分子区域,而后将球面图的子区域映射到经纬图,再对经纬图进行下采样。
在一种可能的设计中,在对采样后的各个子区域的图像进行编码之前,该方法还包括:调整采样后的各个子区域的位置,使得调整后的各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。这样,在拼接后的图像中可以按顺序对子区域进行编号,以根据各个子区域的编号方便对服务器和终端对各个子区域进行传输处理。
在一种可能的设计中,对采样后的各个子区域的图像进行编码包括:对拼接成的图像分片(Tile)进行编码。这样可以生成单个码流进行保存,或者将该单个码流进行切割后获得多个子区域进行保存。
在一种可能的设计中,在对得到的各个子区域的图像进行编码之后,该方法还包括:对通过编码获得的各个子区域的图像对应的码流进行独立封装,并且编码各个子区域的位置信息;其中,编码后的所有子区域的位置信息与所有子区域的码流存在于一个轨迹中;或者,编码后的每个子区域的位置信息与码流存在于各自的轨迹中;或者,编码后的所有子区域的位置信息存在于媒体呈现描述(MPD);或者,编码后的所有子区域的位置信息存在于私有文件中,且私有文件的地址存在于MPD中;或者,编码后的每个子区域的位置信息存在于每个子区域的码流的辅助增强信息(SEI)中。
在一种可能的设计中,当子区域由对待处理图像的经纬图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的经纬图,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在采样后的经纬图中的位置以及尺寸大小;或者,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在拼接成的图像中的位置以及尺寸大小;或者,当子区域由对待处理图像的球面图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的球面图,位置信息包括子区域在球面图的图像中的位置以及经纬度范围,以及子区域在采样后的球面图的图像中的位置以及尺寸大小;或者,位置信息包括子区域在球面图图像中的位置以及经纬度范围,以及子区域在拼接成的图像中的位置以及尺寸大小。这样,终端可以根据子区域的位置以及尺寸大小在播放显示时进行图像渲染和呈现。
在一种可能的设计中,私有文件还包括用于表征用户视点与用户视点的视角覆盖的子区域的编号的对应关系的信息。对于终端来说,当终端确定用户视点时,就可以直接根据该对应关系确定该视点的视角覆盖的子区域,以根据该子区域的码流进行解码显示,可以提升终端解码时的解码速度。
在一种可能的设计中,私有文件还包括用于表征用户视角覆盖的子区域中需优先显示的子区域个数的信息、需优先显示的子区域编号的信息、次优先显示的子区域编号的信息以及不显示的子区域编号的信息。这样,当存在某些原因,例如网络不稳定导致所有的子区域码流无法完全获取或不需要完全获取时,终端可以优先获取离视点近的子区域的图像进行优先显示,而摒弃不作优先显示的子区域的图像数据。
在一种可能的设计中,经纬图包括左眼对应的经纬图和右眼对应的经纬图;在对待处理图像的经纬图或球面图进行横向划分和纵向划分之前,该方法还包括:将左眼对应的经纬图与右眼对应的经纬图进行分割;对待处理图像的经纬图或球面图进行横向划分和纵向划分,包括:对左眼对应的经纬图进行横向划分和纵向划分,以及对右眼对应的经纬图进行横向划分和纵向划分。这样,对于3D视频图像来说,也可以按照本申请的子区域划分方式,减小带宽以及提升编码传输时的效率。
在一种可能的设计中,该方法还包括:将通过编码获得的各个子区域的图像对应的码流发送给终端;或者,接收终端发送的视角信息,根据视角信息获取视角信息对应的子区域,将视角信息对应的子区域的码流发送给终端;或者,接收终端发送的子区域的编号,将子区域的编号对应的码流发送给终端。也即,终端可以从本地获取所需子区域的图像对应的码流,也可以从是服务器根据视角信息确定子区域后,将子区域对应的码流发送给终端,也可以是终端确定所需的子区域的编号后通知服务器,服务器将子区域对应的码流发送给终端,可以降低服务器的计算负载。
在一种可能的设计中,经纬图为360度全景视频图像的经纬图,或360度全景视频图像的经纬图的一部分;或球面图为360度全景视频图像的球面图,或360度全景视频图像的球面图的一部分。也即,本申请对子区域的划分方式也可以适用于180度半全景视频图像的划分,进而降低180度半全景视频图像传输时的带宽和提升编码传输效率。
第二方面,提供一种图像的处理方法,应用于终端,包括:确定全景图像各个子区域的位置信息;根据确定的各个子区域的位置信息,确定当前视角覆盖的子区域在全景图像中的位置信息,并确定子区域的第一采样间隔;根据确定的当前视角覆盖的子区域位置信息,获取当前视角覆盖的子区域对应的码流;对码流进行解码,以得到当前视角覆盖的子区域的图像;根据确定的当前视角覆盖的子区域的位置信息以及第一采样间隔对解码后的图像进行重采样,并对重采样后的图像进行播放。于是,采样间隔可以随子区域的位置变化,不会如现有技术中子区域是均匀划分,解码时按照既定的采样间隔解码呈现图像,本申请终端可以根据不同的采样间隔重采样图像进行显示,可以提升解码端图像的显示速度。
在一种可能的设计中,确定全景图像各个子区域的位置信息包括:接收服务器发送的第一信息,第一信息包括全景图像的各个子区域的轨迹以及各个子区域的码流,轨迹包括全景图像的所有子区域的位置信息;根据轨迹,得到全景图像中的各个子区域的位置信息。
在一种可能的设计中,确定全景图像中各个子区域的位置信息包括:接收服务器发送的媒体呈现描述(MPD),其中,MPD包括各个子区域的位置信息,或者,MPD中包括私有文件的地址,且私有文件包括各个子区域的位置信息;解析MPD,以获取各个子区域的位置信息。
在一种可能的设计中,子区域的位置信息存在于子区域对应的码流的辅助增强信息(SEI)中。
在一种可能的设计中，获取当前视角覆盖的子区域对应的码流，包括：从终端的存储器中获取当前视角覆盖的子区域对应的码流；或者，向服务器请求获取当前视角覆盖的子区域对应的码流。
在一种可能的设计中,向服务器请求获取当前视角覆盖的子区域对应的码流,包括:将指示当前视角的信息发送给服务器,接收服务器发送的当前视角覆盖的子区域对应的码流;或者,按照终端与服务器预设的协议,从服务器获取当前视角覆盖的子区域对应的码流,协议包括视角与该视角覆盖的子区域的对应关系,这样可以根据该对应关系提升终端从服务获取子区域对应的码流的速度。
在一种可能的设计中,确定子区域的第一采样间隔,包括:确定预设的采样间隔为第一采样间隔;或者,从服务器接收第一采样间隔;或者,根据从服务器接收到的各个子区域的位置信息获取第一采样间隔,即各个子区域的位置信息不同时,对应的第一采样间隔也可以不同。
第三方面,提供一种图像的处理方法,应用于服务器,包括:保存全景图像的经纬图或球面图的各个子区域的图像对应的码流,子区域根据对全景图像的经纬图或球面图进行横向划分和纵向划分得到,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离;向终端发送终端请求的保存的各个子区域的图像对应的码流中当前视角覆盖的子区域的码流。这样,该服务器保存的各个子区域的图像对应的码流在传输给终端时,由于本申请不同纬度之间按照至少两种纵向划分间隔进行纵向划分子区域的方式,可以避免如现有技术中均匀划分细致的特征,本申请可以按照多种纵向划分间隔纵向划分,能够使得图像的子区域有多种大小,划分间隔越大,子区域越大,编码时的编码效率得到提升,编码后服务器向终端传输码流时的占用的带宽减小。
在一种可能的设计中，保存于服务器的子区域对应的图像在被编码之前，被按照第一采样间隔在横向进行采样；其中，子区域所对应的纬度越高，第一采样间隔越大；或者，被按照第二采样间隔在纵向进行采样。
第四方面,提供一种服务器,包括:划分单元,用于对待处理图像的经纬图或球面图进行横向划分和纵向划分,以得到经纬图或球面图的各个子区域,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离;编码单元,用于对得到的各个子区域的图像进行编码。
在一种可能的设计中,纵向划分的划分位置由纬度确定包括:纵向划分的划分位置所处的纬度越高,纵向划分间隔越大。
在一种可能的设计中，还包括采样单元，用于：对子区域的图像按照第一采样间隔在横向进行采样；其中，子区域所对应的纬度越高，第一采样间隔越大；编码单元用于：对采样后的各个子区域的图像进行编码。
在一种可能的设计中,采样单元还用于:对子区域的图像按照第二采样间隔在纵向进行采样。
在一种可能的设计中,采样单元还用于:将子区域的图像按照预设尺寸映射为二维平面图像;采样单元,用于:对子区域的图像映射的二维平面图像按照第一采样间隔在横向进行采样。
在一种可能的设计中,还包括拼接单元,用于:调整采样后的各个子区域的位置,使得调整后的各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。
在一种可能的设计中,编码单元用于:对拼接成的图像分片(Tile)进行编码。
在一种可能的设计中,还包括封装单元,用于:对通过编码获得的各个子区域的图像对应的码流进行独立封装,并且编码各个子区域的位置信息;其中,编码后的所有子区域的位置信息与所有子区域的码流存在于一个轨迹中;或者,编码后的每个子区域的位置信息与码流存在于各自的轨迹中;或者,编码后的所有子区域的位置信息存在于媒体呈现描述(MPD);或者,编码后的所有子区域的位置信息存在于私有文件中,且私有文件的地址存在于MPD中;或者,编码后的每个子区域的位置信息存在于每个子区域的码流的辅助增强信息(SEI)中。
在一种可能的设计中,当子区域由对待处理图像的经纬图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的经纬图,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在采样后的经纬图中的位置以及尺寸大小;或者,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在拼接成的图像中的位置以及尺寸大小;或者,当子区域由对待处理图像的球面图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的球面图,位置信息包括子区域在球面图的图像中的位置以及经纬度范围,以及子区域在采样后的球面图的图像中的位置以及尺寸大小;或者,位置信息包括子区域在球面图图像中的位置以及经纬度范围,以及子区域在拼接成的图像中的位置以及尺寸大小。
在一种可能的设计中,私有文件还包括用于表征用户视点与用户视点的视角覆盖的子区域的编号的对应关系的信息。
在一种可能的设计中,私有文件还包括用于表征用户视角覆盖的子区域中需优先显示的子区域个数的信息、需优先显示的子区域编号的信息、次优先显示的子区域编号的信息以及不显示的子区域编号的信息。
在一种可能的设计中,经纬图包括左眼对应的经纬图和右眼对应的经纬图;划分单元,用于:将左眼对应的经纬图与右眼对应的经纬图进行分割;划分单元用于:对左眼对应的经纬图进行横向划分和纵向划分,以及对右眼对应的经纬图进行横向划分和纵向划分。
在一种可能的设计中,还包括传输单元,用于:将通过编码获得的各个子区域的图像对应的码流发送给终端;或者,接收终端发送的视角信息,根据视角信息获取视角信息对应的子区域,将视角信息对应的子区域的码流发送给终端;或者,接收终端发送的子区域的编号,将子区域的编号对应的码流发送给终端。
在一种可能的设计中,经纬图为360度全景视频图像的经纬图,或360度全景视频图像的经纬图的一部分;或球面图为360度全景视频图像的球面图,或360度全景视频图像的球面图的一部分。
第五方面,提供一种终端,包括:获取单元,用于确定全景图像各个子区域的位置信息;获取单元还用于,根据确定的各个子区域的位置信息,确定当前视角覆盖的子区域在全景图像中的位置信息,并确定子区域的第一采样间隔;获取单元,还用于根据确定的当前视角覆盖的子区域位置信息,获取当前视角覆盖的子区域对应的码流; 解码单元,用于对码流进行解码,以得到当前视角覆盖的子区域的图像;重采样单元,用于根据确定的当前视角覆盖的子区域的位置信息以及第一采样间隔对解码后的图像进行重采样;播放单元,用于重采样后的图像进行播放。
在一种可能的设计中，获取单元用于：接收服务器发送的第一信息，第一信息包括全景图像的各个子区域的轨迹以及各个子区域的码流，轨迹包括全景图像的所有子区域的位置信息；获取单元，还用于根据轨迹，得到全景图像中的各个子区域的位置信息。
在一种可能的设计中，获取单元用于：接收服务器发送的媒体呈现描述（MPD），其中，MPD包括各个子区域的位置信息，或者，MPD中包括私有文件的地址，且私有文件包括各个子区域的位置信息；解析MPD，以获取各个子区域的位置信息。
在一种可能的设计中,子区域的位置信息存在于子区域对应的码流的辅助增强信息(SEI)中。
在一种可能的设计中,获取单元用于:从终端的存储器中获取当前视角覆盖的子区域对应的码流;或者,向服务器请求获取当前视角覆盖的子区域对应的码流。
在一种可能的设计中,获取单元用于:将指示当前视角的信息发送给服务器,接收服务器发送的当前视角覆盖的子区域对应的码流;或者,按照终端与服务器预设的协议,从服务器获取当前视角覆盖的子区域对应的码流,协议包括视角与该视角覆盖的子区域的对应关系。
在一种可能的设计中,获取单元用于:确定预设的采样间隔为第一采样间隔;或者,从服务器接收第一采样间隔。
第六方面,提供一种服务器,包括:存储单元,用于保存全景图像的经纬图或球面图的各个子区域的图像对应的码流,子区域根据对全景图像的经纬图或球面图进行横向划分和纵向划分得到,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离;传输单元,用于向终端发送终端请求的保存的各个子区域的图像对应的码流中当前视角覆盖的子区域的码流。
在一种可能的设计中,保存于服务器的子区域对应的图像在被编码之前,被按照第一采样间隔在横向进行采样;其中,子区域所对应的纬度越高,第一采样间隔越大;或者,被按照第二采样间隔在纵向进行采样。也就是说,对高纬度的子区域在横向进行下采样,也即压缩采样,能够使得高纬度的子区域在编码前传输的图像像素冗余度降低,达到减小带宽的目的,同时下采样使得需要编码传输的像素值减少,使得解码端对于解码能力的要求降低,解码复杂度下降,从而使得解码速度得到提升。
又一方面,本申请实施例提供了一种计算机存储介质,用于储存为上述服务器所用的计算机软件指令,其包含用于执行上述方面所设计的程序。
又一方面,本申请实施例提供了一种计算机存储介质,用于储存为上述终端所用的计算机软件指令,其包含用于执行上述方面所设计的程序。
又一方面,本申请实施例提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述各方面的方法。
本申请实施例提供一种图像的处理方法、终端和服务器,包括:对待处理图像的 经纬图或球面图进行横向划分和纵向划分,以得到经纬图或球面图的各个子区域,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离;对得到的各个子区域的图像进行编码。这样,相对于现有技术对经纬图按照相同的划分间隔进行均匀划分的方式中,均匀划分时划分的细致的特性导致的编码效率低,编码后传输时占用的带宽大的问题,本申请这种不同纬度之间按照至少两种纵向划分间隔进行纵向划分的方式,避免如现有技术中均匀划分细致的特征,本申请可以按照多种纵向划分间隔纵向划分,能够使得图像的子区域有多种大小,划分间隔越大,子区域越大,编码时的编码效率得到提升,编码后服务器向终端传输码流时的占用的带宽减小。
附图说明
图1为本申请实施例提供的一种360度全景图像信号的示意图;
图2为本申请实施例提供的一种360度全景图像信号的示意图转化为经纬图的示意图;
图3为本申请实施例提供的一种网络架构的示意图;
图4为本申请实施例提供的一种图像的处理方法的流程示意图;
图5为本申请实施例提供的一种经纬图划分为42个子区域时的示意图;
图6为本申请实施例提供的一种经纬图划分为50个子区域时的示意图;
图7为本申请实施例提供的一种图像的处理方法的流程示意图;
图8为本申请实施例提供的一种经纬图中的视点区域的示意图;
图9为本申请实施例提供的一种视角覆盖的子区域的示意图;
图10为本申请实施例提供的一种图像的处理方法的流程示意图;
图11为本申请实施例提供的一种终端解码显示过程的示意图;
图12为本申请实施例提供的一种3D经纬图的子区域的划分示意图;
图13为本申请实施例提供的一种180°半全景视频的经纬图横向划分的方式的示意图;
图14为本申请实施例提供的一种3D180°半全景视频的经纬图的子区域划分方式的示意图;
图15为本申请实施例提供的一种图像的处理方法的流程示意图;
图16为本申请实施例提供的一种球面全景信号中进行划分并获得图像子区域的方法的示意图;
图17A为本申请实施例提供的一种图像的处理方法的流程示意图;
图17为本申请实施例提供的一种服务器的结构示意图;
图18为本申请实施例提供的一种服务器的结构示意图;
图19为本申请实施例提供的一种服务器的结构示意图;
图20为本申请实施例提供的一种终端的结构示意图;
图21为本申请实施例提供的一种终端的结构示意图;
图22为本申请实施例提供的一种终端的结构示意图;
图23为本申请实施例提供的一种服务器的结构示意图;
图24为本申请实施例提供的一种服务器的结构示意图;
图25为本申请实施例提供的一种服务器的结构示意图。
具体实施方式
为了便于理解,示例地给出了部分与本申请相关概念的说明以供参考。如下所示:
全景视频:指VR全景视频,也可以称为360度全景视频或360视频,是一种用多个摄像机进行全方位360度进行拍摄的视频,用户在观看视频的时候,可以随意调节视频上下左右进行观看。
3D全景视频:指3D格式的VR全景视频,该视频包括两路360度全景视频,一路用于左眼显示,一路用于右眼显示,两路视频在同一帧中为左眼和右眼显示的内容有些许差异,使用户在观看时出现3D效果。
经纬图:即等距柱状投影图(Equirectangular Projection,ERP)。全景图像格式的一种,将球面信号按照等经度间隔和等纬度间隔均匀采样映射获得的能够用于保存和传输的二维全景图像。该图像的横纵坐标可以用经纬度来表示,宽度方向上可用经度表示,跨度为360°;高度方向上可用纬度表示,跨度为180°。
视频解码(video decoding):将视频码流按照特定的语法规则和处理方法恢复成重建图像的处理过程。
视频编码(video encoding):将图像序列压缩成码流的处理过程。
视频编码(video coding):video encoding和video decoding的统称,中文译名和video encoding相同。
Tile:指视频编码标准,即高效视频编码(High Efficiency Video Coding,HEVC)中针对待编码图像进行划分所得到的方块形编码区域,一帧图像可划分为多个Tile,多个Tile共同组成该帧图像。每个Tile可以独立编码。本申请中的Tile可以是应用了运动受限的Tile集合(motion-constrained tile sets,MCTS)技术的Tile。
MCTS:运动受限的Tile集合,是针对Tile的一种编码技术,该技术在编码时对Tile内部的运动矢量加以限制,使得图像序列中相同位置的Tile在时域上不会参考该Tile区域位置以外的图像像素,因此时域上各个Tile可以独立解码。
子图像(sub-picture):对整幅图像进行划分,获得原图像的一部分称为该图像的子图像。本申请中的子图像可以是形状为方形的子图像。
图像子区域:本申请中的图像子区域可以作为Tile或者子图像的统称,可以简称为子区域。
VDC:基于视角的视频编码,是针对全景视频编码的一种编码传输技术,即基于用户在终端所观看到的视角进行编码传输的方法。
Tile-wise编码:视频编码的一种方式,将图像序列划分为多个图像子区域,分别将所有子区域进行独立编码,生成单个或多个码流的过程。本申请中的Tile-wise编码可以是VDC中的Tile-wise编码。
track：可以译为“轨迹”，是指一系列有时间属性的按照国际标准化组织（International Standardization Organization，ISO）基本媒体文件格式（ISO base media file format，ISOBMFF）的封装方式的样本，比如视频track，即视频样本，是视频编码器编码每一帧后产生的码流，按照ISOBMFF的规范对所有的视频样本进行封装产生样本。
box:可以译为“盒子”,在标准中是指面向对象的构建块,由唯一的类型标识符和长度定义。在某些规范中可以称为“原子”,包括MP4的第一个定义。box是构成ISOBMFF文件的基本单元,box可以包含其他的box。
辅助增强信息(supplementary enhancement information,SEI),是视频编解码标准(h.264,h.265)中定义的一种网络接入单元(Network Abstract Layer Unit,NALU)的类型。
MPD：标准ISO/IEC 23009-1中规定的一种文档，在该文档中包含了客户端构造超文本传输协议（HyperText Transfer Protocol，HTTP）-统一资源定位符（Uniform Resource Locator，URL）的元数据。在MPD中包含一个或者多个周期（period）元素，每个period元素包含有一个或者多个自适应集（adaptationset），每个adaptationset中包含一个或者多个表示（representation），每个representation中包含一个或者多个分段，客户端根据MPD中的信息选择表示，并构建分段的HTTP-URL。
ISO基本媒体文件格式:是由一系列的box组成,在box中可以包含其他的box。在这些box中包含元数据box和媒体数据box,元数据box(moov box)中包含的是元数据,媒体数据box(mdat box)中包含的是媒体数据。元数据的box和媒体数据的box可以是在同一个文件中,也可以是在分开的文件中。
本申请实施例可以用于在全景视频或部分全景视频编码前的处理,以及编码后的码流进行封装的过程,在服务器和终端中都有涉及相应的操作和处理。
如图3所示,本申请的网络架构可以包括服务器31和终端32。与服务器31通信的还包括拍摄设备33,该拍摄设备可以用于拍摄360度全景视频,将视频传输给服务器31。服务器可以对全景视频进行编码前处理,而后进行编码或转码操作,再将编码后的码流封装为可传输的文件,将文件传输到终端或内容分发网络。服务器还可以根据终端反馈的信息(例如用户视角等),选择需要传输的内容进行信号传输。终端32可以为VR眼镜、手机、平板电脑、电视以及电脑等可以连接网络的电子设备。终端32可以接收服务器31发送的数据,并进行码流解封装以及解码显示等。
本申请为了解决图像处理时,采用经纬图均匀划分图像造成编码传输的带宽浪费和解码端解码能力以及速度受限的问题,可以提供一种图像的处理方法,该方法可以为基于多个图像子区域的经纬图Tile-wise划分与处理方法以及相对应的编码传输和解码呈现方式。在本申请实施例中,约定经纬图横向经度范围为0至360°,纵向纬度范围为-90°至90°。负数度数表示南纬,正数度数表示北纬。如图4所示,该方法可以包括:
编码前处理:
401、服务器待处理图像的经纬图进行横向划分,横向划分的划分位置为预先设定的纬度。
该图像可以是视频的多个序列图像。
以服务器根据拍摄设备采集到的视频得到该视频的经纬图如图5中的(a)所示为例,服务器在经纬图纵向上的纬度-60°、-30°、0°、30°以及60°处划纬度线,横向切分该经纬图,图5中的(a)中以X表示纬度值,经纬图赤道处为纬度0°,在北纬90° 与南纬-90°之间,北纬30°以及60°横向划分经纬图,南纬-60°以及-30°横向划分经纬图,横向划分间隔为30°。划分间隔也可以理解为划分步长。
402、服务器对待处理图像经纬图进行纵向划分,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离,得到经纬图的各个子区域。
一种可实现的方式中,纵向划分时,经纬图南纬部分,不同纬度之间的纵向划分间隔可以不同,南纬部分和北纬部分相应纬度之间的纵向划分间隔可以相同。纵向划分的划分位置所处的纬度越高,纵向划分间隔可以越大,也可以存在不同纬度之间纵向划分间隔相同的情况。
例如,对于横向划分的划分位置,南纬部分的纬度范围-90°至-60°以及北纬部分的纬度范围60°至90°之间的子图像,可以以经度120°为纵向划分间隔,纵向划分该子图像获得3个子区域;对于纬度范围-60°至-30°以及30°至60°之间的子图像,以经度60°为纵向划分间隔,纵向划分子图像获得6个子区域;对于纬度范围-30°至0°以及0°至30°之间的子图像,以经度30°为纵向划分间隔,纵向划分该子图像获得12个子区域。这样,对整个经纬图的子区域划分完成,共获得了42个子区域,如图5中的(a)所示。纵向划分间隔包括经度120°、经度60°以及经度30°。
另一种可实现的方式中,与上述方式中对子图像的划分方式不同,将经纬图可以划分为50个子区域。示例性地,对于纬度范围-90°至-60°之间以及60°至90°之间的子图像,不进行纵向划分,保持单个子区域;对于纬度范围-60°至-30°之间以及30°至60°之间的子图像,以经度30°为纵向划分间隔,纵向划分该子图像获得12个子区域;对于纬度范围-30°至0°之间以及0°至30°之间的子图像,以经度30°为纵向划分间隔,纵向划分子图像获得12个子区域。这样,对整个经纬图的子区域划分完成,获得50个子区域,如图6中的(a)图所示。划分步长包括经度30°以及经度0°。划分步长为经度0°即表示对子图像不进行纵向划分。
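As a non-normative illustration of the two division schemes above, the following Python sketch (names and structure are ours, not part of the patent text) enumerates the 42 sub-areas of the first scheme; the 50-sub-area variant of Figure 6(a) differs only in its per-band interval table:

# Horizontal cuts at the preset latitudes, then per-band longitudinal
# division intervals of 120/60/30 degrees, as in Figure 5(a).
LAT_BANDS = [(-90, -60), (-60, -30), (-30, 0), (0, 30), (30, 60), (60, 90)]

def lon_interval(band):
    top = max(abs(band[0]), abs(band[1]))  # highest absolute latitude of the band
    if top > 60:
        return 120  # polar bands: 3 sub-areas each
    if top > 30:
        return 60   # mid-latitude bands: 6 sub-areas each
    return 30       # equatorial bands: 12 sub-areas each

sub_areas = [(band, (lon, lon + lon_interval(band)))
             for band in LAT_BANDS
             for lon in range(0, 360, lon_interval(band))]
print(len(sub_areas))  # 42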
403、服务器对得到的各个子区域的图像进行编码。
这样,本申请这种不同纬度之间按照至少两种纵向划分间隔进行纵向划分的方式,相对于均匀划分时划分的细致的特性,导致的编码效率低,编码后传输时占用的带宽大的问题,本申请可避免如现有技术中划分细致的特征,可以按照多种纵向划分间隔纵向划分经纬图,能够使得子区域划分有多种大小,纵向划分间隔越大,子区域越大,例如可以是纵向划分的划分位置所处的纬度越高时,纵向划分间隔越大,子区域越大,编码时的编码效率得到提升,编码后服务器向终端传输码流时的占用的带宽减小。
进一步的,现有的对经纬图进行均匀划分的方式,对于解码端,即对于终端来说,终端得到的冗余像素较多,使得终端对最大解码能力的要求也会增加,解码速度也会有很大挑战,针对该问题,本申请还可以通过对进行非均匀划分后的子区域的像素进行去冗余度处理,即下采样,这样需要编码传输的像素点减少,解码端要求的最大解码能力降低,可以使得解码复杂度下降,提高解码器的解码速度。因此,如图7所示,本申请的实施方法在上述步骤403之前,还可以包括:
404、服务器对子区域的图像在纵向进行原采样,或对子区域的图像按照第二采样间隔在纵向进行采样。
例如对图5中的(a)或图6中的(a)所示的经纬图划分后的各个子区域来说,原采样可以理解为对各个子区域的图像纵向保持不变,不进行缩放处理,或者不进行处理。按照第二采样间隔进行采样,例如在纵向对各个子区域整体进行下采样,也可以理解为在纵向按照既定给的子区域高度进行采样。
405、服务器对子区域的图像按照第一采样间隔在横向进行采样,子区域所对应的纬度越高,第一采样间隔越大。
第一采样间隔和第二采样间隔可以是在服务器侧预设的,第一采样间隔和第二采样间隔可以相同,也可以不同。第一采样间隔可以理解为缩放系数的倒数,即每多个像素点采样一个像素点,以得到缩放后的图像。
例如对于图5中的(a)所示的经纬图来说,对于纬度范围-90°至-60°以及纬度范围60°至90°之间的子图像,横向进行下采样,第一采样间隔为4,即每4个像素点采样一个像素点,缩放系数为1/4;对于纬度范围-60°至-30°以及纬度范围30°至60°之间的子图像,同样进行横向下采样,缩放系数为1/2;对于纬度范围-30°至0°以及0°至30°之间的子图像,不进行横向缩放。最终得到的采样后的图像如图5中的(b)所示。需要说明的是,图5中的(b)即为将图5中的(a)纵向保持不变时,只将横向进行下采样时采样后的图像。该举例中,在横向采样时第一采样间隔与纬度成正比,也就是说,对于北纬图像部分,子区域对应的纬度越高,第一采样间隔越大,同样,对于南纬图像部分,纬度越高,第一采样间隔越大。南纬图像部分和北纬图像部分,相同的纬度对应的采样间隔相同。
再例如,对于图6中的(a)所示的经纬图来说,相比图5中的(b)中纵向进行下采样后的示意图,经纬图进行划分和缩放后的子区域大小在不同的纬度之间也可以为非均匀大小,这样可以突破图5中的(b)中缩放后的子区域大小为相同大小的限制,使得服务器在编码传输时的编码传输效率得到提升。具体来说,对于图6中的(a)所示的经纬图,对于纬度范围-90°至-60°以及纬度范围60°至90°之间的子图像,纵向不变,横向按照采样间隔进行下采样,缩放系数为1/4;对于纬度范围-60°至-30°以及30°至60°之间的子图像,同样保持纵向不变,进行横向下采样,缩放系数7/12;对于纬度范围-30°至0°以及0°至30°之间的子图像,不进行缩放,即纵向和横向都不进行缩放处理,最终得到的缩放图像如图6中的(b)所示。
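A minimal sketch of this horizontal downsampling, assuming the Figure 5 scale factors (1/4 for the polar bands, 1/2 for the mid-latitude bands, 1 near the equator; the Figure 6 variant would use 7/12 for the mid bands) and the Pillow library; the helper name is our assumption:

from PIL import Image

SCALE_BY_TOP_LAT = {90: 1 / 4, 60: 1 / 2, 30: 1.0}  # keyed by the band's top |latitude|

def downsample_sub_area(img: Image.Image, top_abs_lat: int) -> Image.Image:
    # Horizontal direction only: the width shrinks by the latitude-dependent
    # factor (i.e. a larger first sampling interval); the height is unchanged.
    w, h = img.size
    scale = SCALE_BY_TOP_LAT[top_abs_lat]
    return img.resize((max(1, round(w * scale)), h))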
可选的,由于缩放后的经纬图为不规则图像,如图5中的(b)和图6中的(b)所示,本申请还可以将缩放后的子区域进行重新摆放组合,形成预设图像,因此,该方法还可以包括:
406、服务器调整采样后的各个子区域的位置,使得调整后各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。
例如对于图5中的(b)所示的经纬图来说,调整位置后的图像可以如图5中的(c)所示。
步骤403可以替换为:
407、服务器对采样后的各个子区域的图像进行编码。
例如对于上述进行子区域划分和缩放后得到的图5中的(b)中的42个子区域，或者对于重新组合后的图5中的(c)中的42个子区域来说，可以对每个子区域进行编码。这里可以包括两种编码方式：(1)子图像编码方式，即对每个子图像序列进行独立编码，生成42个子码流，即每个子图像对应一个码流。该子图像可以为上述子区域，即对该42个子区域分别进行独立编码，得到每个子区域对应的码流；(2)将整幅图像进行分片(Tile)模式编码，编码时可以使用MCTS技术，生成该整幅图像的单个码流进行保存，或者将单个码流进行切割后获得42个子码流进行保存。这里的整幅图像可以是将源经纬图进行采样缩放后的图像，如图5中的(b)，也可以是将采样缩放后的图像进行重新组合后的规则图像，如图5中的(c)。
在对图像进行编码之后,服务器还需要对编码后得到的每个子区域的码流进行封装,因此,该方式还可以包括:
408、服务器对编码获得的各个子区域的图像对应的码流进行独立封装,并且编码各个子区域的位置信息。
服务器可以将所有子区域的码流独立封装在一个轨迹,即track中,例如封装在tile track中,也可以封装在各自对应的track中。子区域的位置信息可以理解为子区域划分方式的描述信息,编码后的所有子区域的位置信息与所有子区域的码流可以存在于一个track中;或者,编码后的每个子区域的位置信息与码流存在于各自的track中;或者,编码后的所有子区域的位置信息存在于媒体呈现描述(MPD);或者,编码后的所有子区域的位置信息可以存在于私有文件中,且私有文件的地址存在于MPD中;或者,编码后的每个子区域的位置信息存在于每个子区域的码流的辅助增强信息(SEI)中。本申请对于子区域位置信息的保存方式不做限定。
当子区域由对待处理图像的经纬图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的经纬图,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在采样后的经纬图中的位置以及尺寸大小;或者,位置信息包括子区域在经纬图中的位置以及尺寸大小,以及子区域在拼接成的图像中的位置以及尺寸。其中,尺寸大小可以包括宽和高。
下面对上述子区域的位置信息的各种保存方式分别进行说明。
方式一、对于所有子区域的位置信息保存在一个track中,可以是在拼接后的图像的track中添加所有子区域划分方式的描述信息,例如在拼接后的图像的track中的moov box中增加下述语法:
aligned(8)class RectRegionPacking(i){
unsigned int(16)proj_reg_width[i];
unsigned int(16)proj_reg_height[i];
unsigned int(16)proj_reg_top[i];
unsigned int(16)proj_reg_left[i];
unsigned int(3)transform_type[i];
bit(5)reserved=0;
unsigned int(16)packed_reg_width[i];
unsigned int(16)packed_reg_height[i];
unsigned int(16)packed_reg_top[i];
unsigned int(16)packed_reg_left[i];
}
RectRegionPacking(i):表示描述的是第i个子区域的划分信息。
proj_reg_width[i],proj_reg_height[i]:描述了采样后的图像中第i个子区域在源图像,即未采样前的经纬图(例如5中的(a))中的对应的宽高,例如描述的是图5中的(b)中的子区域在图5中的(a)中对应的宽高,比如对于宽x高,即3840x1920的经纬图,图5中的(b)中左上角第一个子区域在源图像中的宽和高是(1280,320)。
proj_reg_top[i],proj_reg_left[i]:描述了采样后的图像第i个子区域的左上角像素在源图像中的对应位置,例如描述的是图5中的(b)中的子区域左上点在图5中的(a)中对应的位置,比如图5中的(b)中左上角第一个子区域在源图像中的位置是(0,0)。该位置是以源图像的左上角为(0,0)坐标获取的。
transform_type[i]:描述了采样后的图像中第i个子区域从对应源图像位置经过的变换,比如第i个子区域是源图像中对应区域经过不变换/90度旋转/180度旋转/270度旋转/水平镜像/水平镜像后90度旋转/水平镜像后180度旋转/水平镜像后270度旋转获得的。
packed_reg_width[i],packed_reg_height[i]:描述了采样后的图像中第i个子区域在组合之后的规则图像中的宽和高,也就是图5中的(c)中的子区域的宽高。比如图5中的(b)中左上角第1个子区域在组合之后的规则图像中的宽和高为(320,320)。需要说明的是,当上述步骤406不执行时,子区域组合之后的图像为图5中的(b),则该宽高指的是在图像5中的(b)中的宽高。
packed_reg_top[i],packed_reg_left[i]:描述了采样后的图像中第i个子区域的左上角像素在子区域组合后的规则图像中的相对位置,也就是图5中的(c)中的各子区域的左上点位置。需要说明的是,当上述步骤406不执行时,子区域组合之后的图像为图5中的(b),则该位置指的是在图像5中的(b)中的位置。
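For illustration only, a sketch of how these fields could be filled for one sub-area (the helper is hypothetical; transform_type 0 stands for no rotation or mirroring):

def rect_region_packing(proj_left, proj_top, proj_w, proj_h,
                        packed_left, packed_top, scale_x):
    # Position/size in the source latitude-longitude map ("proj_*") and in
    # the packed image ("packed_*"); the width shrinks with the sampling.
    return {"proj_reg_left": proj_left, "proj_reg_top": proj_top,
            "proj_reg_width": proj_w, "proj_reg_height": proj_h,
            "transform_type": 0,
            "packed_reg_left": packed_left, "packed_reg_top": packed_top,
            "packed_reg_width": int(proj_w * scale_x),
            "packed_reg_height": proj_h}

# First top-left sub-area of a 3840x1920 source: 1280x320 in the source,
# 320x320 in the packed image after 1/4 horizontal sampling.
print(rect_region_packing(0, 0, 1280, 320, 0, 0, 1 / 4))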
方式二:在各个子区域的位置信息保存在各自对应的track中时,可以在tile track中描述相应的子区域的划分方式,具体可以在tile track中的moov box中增加下述语法:
aligned(8)class SubPictureCompositionBox extends TrackGroupTypeBox('spco'){
unsigned int(16)track_x;
unsigned int(16)track_y;
unsigned int(16)track_width;
unsigned int(16)track_height;
unsigned int(16)composition_width;
unsigned int(16)composition_height;
unsigned int(16)proj_tile_x;
unsigned int(16)proj_tile_y;
unsigned int(16)proj_tile_width;
unsigned int(16)proj_tile_height;
unsigned int(16)proj_width;
unsigned int(16)proj_height;
}
track_x,track_y:描述了当前track的子区域的左上角像素在子区域组合后的规则图像中的位置,也就是图5中的(c)中当前子区域的左上点位置。
track_width,track_height:描述了当前track的子区域在子区域组合后的规则图像中的宽和高,也就是图5中的(c)中当前子区域的宽和高;
composition_width,composition_height:描述了子区域组合后的规则图像的宽和高,也就是图5中的(c)中图像宽和高。
proj_tile_x,proj_tile_y:描述了当前track的子区域的左上角像素在源图像中的位置,也就是图5中的(a)中当前子区域的左上点位置。
proj_tile_width,proj_tile_height:描述了当前track的子区域在源图像中的宽高,也就是图5中的(a)中当前子区域的宽和高;
proj_width,proj_height:描述了源图像宽和高,也就是图5中的(a)中图像宽和高。
方式三、所有子区域的位置信息保存在MPD中,即在MPD中描述子区域的划分方式。
MPD中的语法可以为:
（原文此处为插图，给出方式三中MPD语法的示例。）
在该方式三的语法中,<value="0,1280,0,1280,320,3840,1920"/>的语义如下:第一个0表示源标识,有相同源标识的表示同源,即表示同一个源图像;“1280,0”表示当前表示中的子区域的左上位置在源图像中的坐标;“1280,320”表示当前表示中的子区域的宽和高;“3840,1920”表示源图像宽和高。
在上述MPD中是采用2D图像的方式描述子区域表示所对应的码流中的图像在源视频图像中的位置。可选的,子区域在源图像中的位置还可以采用球面坐标表示,比如将上述value中的信息转化为球面信息比如value=“0,0,30,0,120,30”,具体语义如下:第一个0表示源标识,源标识值相同的表示同源;“0,30,0”表示子区域对应的区域的中心点在球面上的坐标(偏航角,俯仰角,旋转角);“120,30”表示子区域的宽高角度。
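A small client-side parsing sketch for the 2D form of this value string (field order as described above; the function name is ours):

def parse_tile_value(value):
    src_id, left, top, w, h, src_w, src_h = (int(v) for v in value.split(","))
    return {"source_id": src_id,        # equal ids mean the same source image
            "left": left, "top": top,   # sub-area top-left in the source image
            "width": w, "height": h,    # sub-area width and height
            "src_width": src_w, "src_height": src_h}

print(parse_tile_value("0,1280,0,1280,320,3840,1920"))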
方式四:所有子区域的位置信息保存在私有文件中,且私有文件的地址保存在MPD中。即通过MPD中指定文件链接的方式,将保存子区域划分的描述信息的私有文件的地址写入MPD中。
语法可以如下:
（原文此处为插图，给出方式四中MPD语法的示例。）
在方式四中,将子区域的划分信息以私有文件tile_info.dat保存。该文件中保存的子区域划分信息的数据可由用户指定,这里不做限定,比如说保存的内容可以按如下的一种方式:
(文件<tile_info.dat>内容)
unsigned int(16)tile_num;
unsigned int(32)pic_width;
unsigned int(32)pic_height;
unsigned int(32)comp_width;
unsigned int(32)comp_height;
unsigned int(32)tile_pic_width[];
unsigned int(32)tile_pic_height[];
unsigned int(32)tile_comp_width[];
unsigned int(32)tile_comp_height[];
其中各数据表示的含义如下:
tile_num:表示划分的子区域个数。
pic_width:表示源图像宽度,即图5中的(a)中图像的宽度。
pic_height:表示源图像高度,即图5中的(a)中图像的高度。
comp_width:表示子区域组合后的规则图像的宽度,即图5中的(c)中图像的宽度。
comp_height:表示子区域组合后的规则图像的高度,即图5中的(c)中图像的高度。
tile_pic_width[]:数组,表示的是各个子区域在源图像中的宽度,元素个数应为tile_num值。
tile_pic_height[]:数组,表示的是各个子区域在源图像中的高度,元素个数应为tile_num值。
tile_comp_width[]:数组,表示的是各个子区域在子区域组合后的规则图像中的宽度,元素个数应为tile_num值。
tile_comp_height[]:数组,表示的是各个子区域在子区域组合后的规则图像中的高度,元素个数应为tile_num值。
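A hedged sketch of reading this base layout of tile_info.dat (big-endian byte order is assumed here, since the text does not fix it; the extended fields introduced later would be read the same way):

import struct

def read_tile_info(path):
    with open(path, "rb") as f:
        tile_num = struct.unpack(">H", f.read(2))[0]
        pic_w, pic_h, comp_w, comp_h = struct.unpack(">4I", f.read(16))
        u32s = lambda n: list(struct.unpack(">%dI" % n, f.read(4 * n)))
        return {"tile_num": tile_num,
                "pic": (pic_w, pic_h), "comp": (comp_w, comp_h),
                "tile_pic_width": u32s(tile_num),
                "tile_pic_height": u32s(tile_num),
                "tile_comp_width": u32s(tile_num),
                "tile_comp_height": u32s(tile_num)}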
方式四中通过指定新的EssentialProperty属性Tile@value,将私有文件的统一资源定位符(Uniform Resource Locator,URL)写入MPD中。Tile@value属性描述可以如表1。当终端进行视频内容请求时,通过解析该元素获取私有文件,从而获得子区域划分方式和位置等信息。
表1 在"urn:mpeg:dash:tile:2014"中Tile@value属性描述
Tile@value Description
information specifies information of tiles
方式五:各个子区域的位置信息保存在各个子区域的码流的辅助增强信息SEI中。也就是,通过将子区域的位置信息写入码流的SEI来传输子区域的划分方式。基于子区域在图像中的划分信息SEI语法元素中的一种设置可以如表2所示。
表2 基于子区域划分信息的SEI语法元素
总SEI语法
（原文此处为插图，给出总SEI语法。）
子区域划分信息SEI语法
（原文此处为插图，给出子区域划分信息SEI语法。）
表2中,针对SEI类型加入新类型155,表示当前码流为子区域码流,并加入信息tile_wise_mapping_info(payloadSize),包含的语法元素含义如下:
src_pic_width表示源图像宽,即图5(a)中图像宽度。
src_pic_height表示源图像高,即图5(a)中图像高度。
src_tile_x表示当前子区域的左上角在源图像上的横向坐标,即当前子区域在图5 中的(a)中的横坐标。
src_tile_y表示当前子区域的左上角在源图像上的纵向坐标,即当前子区域在图5中的(a)中的纵坐标。
src_tile_width表示当前子区域在源图像上的宽度。
src_tile_height表示当前子区域在源图像上的高度。
packed_pic_width表示子区域组合后的规则图像的宽,即图5(c)中图像宽度。
packed_pic_height表示子区域组合后的规则图像的高,即图5(c)中图像高度。
packed_tile_x表示当前子区域的左上角在组合后的规则图像上的横向坐标,即当前子区域在图5(c)中的横坐标。
packed_tile_y表示当前子区域的左上角在组合后的规则图像上的纵向坐标,即当前子区域在图5(c)中的纵坐标。
packed_tile_width表示当前子区域在组合后的规则图像上的宽度。
packed_tile_height表示当前子区域在组合后的规则图像上的高度。
此外,本申请还可以对上述方式四进行扩展,在MPD中也可以通过新的元素指定保存子区域的位置信息的私有文件的URL。
扩展方式四:通过MPD中指定文件链接的方式,将保存子区域划分信息的私有文件地址写入MPD中。语法可以为:
（原文此处为插图，给出扩展方式四中MPD语法的示例。）
在扩展方式四中,将子区域的位置信息以私有文件tile_info.dat保存,添加语法元素<UserdataList>(见表3),包含UserdataURL元素,将该私有文件写入MPD中。当终端进行视频内容请求时,通过解析<UserdataList>获取该私有文件,来获得子区域的划分方式和位置等信息。
表3 语法元素UserdataList描述
（原文此处为表格插图，给出该语法元素的具体描述。）
对于上述方式四中进行子区域划分方式的描述信息还可以进行扩展,该扩展针对传输的私有文件tile_info.dat中的内容,添加关于用户视角与所需子区域的关系表,使终端能够更快地请求对应的子区域码流。即私有文件还可以包括用于表征用户视点与用户视点的视角覆盖的子区域的编号的对应关系的信息。
在本例中,针对私有文件tile_info.dat,关于子区域划分的信息内容不变,添加关于用户视角与所需子区域的关系表,及用户视点与用户视点的视角覆盖的子区域的编号的对应关系。比如,保存的内容可以按如下的一种方式:
(文件<tile_info.dat>内容)
unsigned int(16)tile_num;
unsigned int(32)pic_width;
unsigned int(32)pic_height;
unsigned int(32)comp_width;
unsigned int(32)comp_height;
unsigned int(32)tile_pic_width[];
unsigned int(32)tile_pic_height[];
unsigned int(32)tile_comp_width[];
unsigned int(32)tile_comp_height[];
unsigned int(16)deg_step_latitude;
unsigned int(16)deg_step_longitude;
unsigned int(32)view_tile_num;
unsigned int(16)viewport_table[][];
其中,相对于方式四中新增的数据为deg_step_latitude,deg_step_longitude,view_tile_num,viewport_table[][],数据表示的含义如下:
deg_step_latitude:纬度方向上划分的视点区域步长,该步长将纬度范围-90°至90°划分为多个视点区域。视点区域指的是某个视点在经纬图上所属的区域范围,在相同视点区域内,终端获得的覆盖该视点区域的图像子区域码流是相同的。如图8所示,整个经纬图被划分为9个视点区域,视点1和视点2均属于第5个视点区域,且图8中标出了视点区域5中心的视点。在视点区域5范围内的所有视点,对应的视角覆盖范围都将计算为中心视点对应视角所覆盖的范围。
deg_step_longitude:经度方向上划分的视点区域步长,该步长将经度范围0°至360° 划分为多个视点区域。deg_step_latitude与deg_step_longitude共同决定了视点区域个数。
view_tile_num:单个视角在变化时,所能覆盖的最大子区域个数。
viewport_table[][]:数组,用于保存某个视点区域与所覆盖该视点区域的图像子区域编号之间的关系表,表中数据总个数应为视点区域个数乘以view_tile_num。
关于数据表viewport_table[][],一种示例保存方式如下所示:
viewport_table[100][18]={
1,2,3,4,5,6,7,8,9,10,13,16,19,0,0,0,0,0,
1,2,3,4,5,6,7,8,9,12,15,18,21,0,0,0,0,0,
5,6,7,13,14,15,16,25,26,27,28,35,36,37,0,0,0,0,
}
该表视点区域个数为100,view_tile_num=18。数据表中每一行的18个数代表某个视点的视角对应覆盖的子区域编号,编号为0表示该视角只需要少于18个子区域就能覆盖,空余值被填0,比如图9所示视点位于纬度0°经度150°的视角,对应覆盖的子区域为编号5,6,7,13,14,15,16,25,26,27,28,35,36,37的子区域,在数据表中的值则表示为5,6,7,13,14,15,16,25,26,27,28,35,36,37,0,0,0,0。这样,终端获得这些值之后,只需要根据当前视点找到对应表中的子区域编号,不需要进行计算即可直接根据对应关系请求这些编号对应的子区域码流进行解码呈现,加快了终端的处理速度。
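A lookup sketch for this table; the row ordering (latitude region first, then longitude region) is our assumption, and entries of 0 are padding:

def covered_sub_areas(lat, lon, step_lat, step_lon, viewport_table):
    lat_idx = min(int((lat + 90) // step_lat), 180 // step_lat - 1)
    lon_idx = min(int((lon % 360) // step_lon), 360 // step_lon - 1)
    row = viewport_table[lat_idx * (360 // step_lon) + lon_idx]
    return [n for n in row if n != 0]

# With the 100-region table above, the viewpoint at latitude 0 and longitude
# 150 returns sub-areas 5,6,7,13,14,15,16,25,26,27,28,35,36,37 by a table
# lookup alone, with no on-line coverage computation.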
基于上述私有文件包括用户视点与用户视点的视角覆盖的子区域的编号的对应关系,本申请实施例还可以在上述私有文件tile_info.dat中加入针对视角的优化呈现的标记数据;相对应的,数据表viewport_table[][]中数据的排列可以以优化的形式出现,即越靠近当前视点的子区域,其子区域的编号出现在当前视点对应行靠前位置。
在本例中,私有文件还包括用于表征用户视角覆盖的子区域中需优先显示的子区域个数的信息、需优先显示的子区域编号的信息、次优先显示的子区域编号的信息以及不显示的子区域编号的信息。针对私有文件tile_info.dat,保存的内容可以按如下的一种方式:
(文件<tile_info.dat>内容)
unsigned int(16)tile_num;
unsigned int(32)pic_width;
unsigned int(32)pic_height;
unsigned int(32)comp_width;
unsigned int(32)comp_height;
unsigned int(32)tile_pic_width[];
unsigned int(32)tile_pic_height[];
unsigned int(32)tile_comp_width[];
unsigned int(32)tile_comp_height[];
unsigned int(16)deg_step_latitude;
unsigned int(16)deg_step_longitude;
unsigned int(32)view_tile_num;
unsigned int(16)priority_view_tile_num;
unsigned int(16)viewport_table[][];
其中,相对于新增的数据为priority_view_tile_num,数据表示的含义为:当前视点下需优先显示的子区域个数。对应的,viewport_table[][]表中数据排列进行修改,将靠近当前视点的子区域放置在当前视点对应行的前面,如下所示:
viewport_table[100][18]={
1,2,3,4,5,6,7,8,9,10,13,16,19,0,0,0,0,0,
1,2,3,4,5,6,7,8,9,12,15,18,21,0,0,0,0,0,
14,15,26,27,13,6,16,28,36,25,5,7,35,37,0,0,0,0,
}
如该表所示,对应于图9所示视点位于纬度0°经度150°的视角,将其在表中的数据更改为14,15,26,27,13,6,16,28,36,25,5,7,35,37,0,0,0,0,离视点较近的子区域编号14,15,26,27放在前面,较远的子区域编号13,6,16,28,36,25放在中间,最远的子区域编号5,7,35,37放置最后。离视点近的子区域优先显示,离视点较远的子区域不作优先显示,可以次优先显示。这样做的好处在于,当存在某些原因(比如网络不稳定)导致所有的子区域码流无法完全获取或不需要完全获取时,可优先获取离视点近的子区域进行优先显示,而摒弃不作优先显示的图像子区域数据。
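A short sketch of the prioritized fetch this enables (names are ours):

def split_by_priority(row, priority_view_tile_num):
    ids = [n for n in row if n != 0]
    return ids[:priority_view_tile_num], ids[priority_view_tile_num:]

first, later = split_by_priority(
    [14, 15, 26, 27, 13, 6, 16, 28, 36, 25, 5, 7, 35, 37, 0, 0, 0, 0], 4)
print(first)  # [14, 15, 26, 27]: fetched and displayed first
print(later)  # fetched next, or dropped when the network is unstable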
在上述服务器经过编码前处理、编码以及封装的过程之后,终端可以获取封装后的码流进行解码显示。因此,如图10所示,本申请实施例方法还可以包括:
101、终端确定全景图像各个子区域的位置信息。
一种可能的实现方式中,终端可以接收服务器发送的第一信息,该第一信息包括全景图像的各个子区域的track以及各个子区域的码流,track包括全景图像的所有子区域的位置信息;终端根据track解析得到全景图像中各个子区域的位置信息。该track可以是上述方式一中的拼接后的图像的track,终端可以通过解析拼接后的图像的track中的RectRegionPacking(i)中定义的语法解析出所有子区域的位置信息。
或者,对于子区域的位置信息来说,终端可以按照上述对于保存子区域的位置信息的方式二,各个子区域的位置信息保存在各自子区域对应的轨迹,即tile track中,终端可以通过解析每个tile track中的SubPictureCompositionBox中定义的区域解析出当前子区域的位置信息。
或者,终端可以通过接收服务器发送的MPD,MPD包括各个子区域的位置信息,或MPD中包括私有文件的地址,且私有文件包括各个子区域的位置信息,终端解析MPD,以获取各个子区域的位置信息。
或者，终端可以在先获取到各个子区域对应的码流的情况下，子区域的位置信息存在于子区域对应的SEI中。也就是说，终端在请求所需的子区域的码流时，就可以根据该码流中的SEI获取子区域的位置信息。
102、终端根据确定的各个子区域的位置信息,确定当前视角覆盖的子区域在全景图像中的位置信息。
例如,终端就可以根据视角和视角覆盖的子区域的位置信息的匹配关系,获取当前视角覆盖的子区域在全景图像中的位置信息。
103、终端确定子区域的第一采样间隔。
终端可以确定预设的采样间隔为第一采样间隔,或者,终端从服务器接收第一采样间隔,或者,终端可以根据从服务器接收到的各个子区域的位置信息获取第一采样间隔,即,各个子区域的位置信息与第一采样间隔可以存在预设的计算规则,以获取每个子区域对应的第一采样间隔。该计算规则可以是源图像中该子区域的位置信息中的尺寸大小与拼接后的图像中的子区域的位置信息中的尺寸大小的比例,即为第一采样间隔。
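A one-line sketch of the ratio rule suggested above, assuming the first sampling interval is the sub-area's source width over its packed width:

def first_sampling_interval(src_width, packed_width):
    # e.g. 1280 / 320 = 4.0 for a polar sub-area of the Figure 5 example
    return src_width / packed_width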
104、终端根据确定的当前视角覆盖的子区域位置信息,获取当前视角覆盖的子区域对应的码流。
如果所有子区域的码流保存在终端本地,终端可以从终端的存储器中直接获取当前视角覆盖的子区域的码流。
或者,终端向服务器请求获取当前视角覆盖的子区域对应的码流。示例性的,终端可以将指示当前视角的信息发送给服务器,服务器可以根据当前视角和当前视角能够覆盖的子区域的位置信息获取当前视角覆盖的子区域,而后将终端所需的当前视角覆盖的子区域对应的码流发送给终端,例如服务器可以将要传输的子区域码流拼接后的码流发送给终端。或者,终端可以根据当前视角和当前视角覆盖的子区域的位置信息获取当前视角覆盖的子区域后,可以将当前视角覆盖的子区域的编号发送给服务器,服务器可以根据编号将终端所需的子区域的码流发送给终端。或者,终端可以按照终端与服务器预设的协议,从服务器获取当前视角覆盖的子区域对应的码流,协议包括视角与该视角覆盖的子区域的对应关系。本申请对于终端获取所需码流的方式不做限定。
105、终端对码流进行解码,以得到当前视角覆盖的子区域的图像。
由于服务器对经纬图进行了横向和纵向划分以及纵向进行下采样处理,即对子区域中的像素进行了去冗余度处理,使得要传输的子区域的像素冗余度降低,像素值减少,那么对于解码端终端来说,在获取到当前视角覆盖的子区域的码流时,可以使得对该解码能力要求降低,解码的复杂度下降,从而使得解码速度得到提升。
106、终端根据确定的当前视角覆盖的子区域的位置信息以及第一采样间隔对解码后的图像进行重采样。
107、终端对重采样后的图像进行播放。
如图11所示，假设用户请求需显示的视角对应的子区域为图11中的(d)所示，根据计算获得的所需子区域如图11中的(b)所示，终端可以根据子区域的编号与码流的对应关系获取所需子区域对应的码流，包括编号为1、3、4、5、6、15、19、20、21、22、23、24、34、35、36以及37的子码流，如图11中的(c)所示，进而，终端对这些子码流解码后，可以根据位置信息以及第一采样间隔对解码后的图像进行重采样，而后对重采样后的图像进行播放，如图11中的(d)。
上述编码前的处理、编码以及终端部分均是以2D经纬图为例进行说明的,本申请实施例还可以用于3D经纬图进行编码传输的过程,3D经纬图序列的两路信号可以分别进行处理。可以理解的是,要呈现3D视觉效果的情况下,与服务器进行通信的拍摄设备可以包括两组,一组拍摄设备用于获取左眼的全景视频,另一组拍摄设备用于获取右眼的全景视频。这样,3D经纬图的子区域的划分可以如图12所示。左眼的经纬图为图12上半部分,右眼的经纬图为图12下半部分。左眼对应的经纬图可以和右眼对应的经纬图拼接在一起,为一张经纬图,也可以是分开的,为两张经纬图。服务器可以将左眼对应的经纬图与右眼对应的经纬图进行分割,以便对左眼对应的经纬图进行横向划分和纵向划分,以及对右眼对应的经纬图进行横向划分和纵向划分。
其中,3D经纬图左眼经纬图的横向划分可以参考步骤401的实现方式,右眼经纬图的横向划分也可以参考步骤401的实现方式,在此不赘述。
3D经纬图左眼经纬图的横向纵向划分可以参考步骤402的实现方式,右眼经纬图的纵向划分也可以参考步骤402的实现方式,在此不赘述。
3D经纬图左眼经纬图的各个子区域的采样可以参考步骤404至405的实现方式,右眼经纬图的各个子区域的采样也可以参考步骤404至405的实现方式,在此不赘述。
于是,最终获得的左眼经纬图中42个子区域,以及右眼经纬图中42个子区域,共84个子区域。
对于3D经纬图划分和采样后的图像中每个子区域的编码方式可以有多种，这里列举三种可能的方式。第一种：将每个子区域作为一个子图像，从原图像中划分开，并对每个子图像序列进行独立编码，生成84个子码流。第二种：将整幅图像进行分子区域模式编码（HEVC标准支持），生成单个码流进行保存，或者将该单个码流进行切割后获得84个子码流进行保存。第三种：将左眼经纬图和右眼经纬图在相同位置对应部分的子区域作为一组子区域，对二者进行图像拼接后进行独立编码，生成42个子码流。
封装过程可以参考上述方式一至方式五,此处不再赘述。
针对3D经纬图视频,终端对视频内容的解码过程与上述对于2D经纬图不同的是:这里的当前视角覆盖的子区域的位置信息包括左眼图像子区域和右眼图像子区域位置信息。
当前视角覆盖的子区域的码流包括左眼经纬图中子区域的码流和右眼经纬图中子区域的码流。当前视角的值可以取左眼的视角点值,也可以取右眼的视角点值,在此不做限制。重采样时是针对当前视角左眼覆盖的子区域的图像进行重采样,以及当前视角右眼覆盖的子区域的图像进行重采样,并对所需左眼子区域和右眼子区域进行渲染和显示。
上述方法过程还可以应用于360度全景视频的经纬图,此外,该经纬图还可以为360度全景视频图像的经纬图的一部分,例如该经纬图的划分方式还可以应用于180°半全景视频的图像的经纬图的划分。180°半全景视频是指经度范围180°包含全景视频一半内容的全景视频。
如图13中的(a)所示，对于180°半全景视频的经纬图横向划分的方式可以参考上述步骤401，对于纵向划分，与上述步骤402的可实现的方式不同的可以是，对于纬度范围-90°至-60°以及60°至90°之间的子图像，可以不进行纵向划分，保持单个子区域；对于纬度范围-60°至-30°以及30°至60°之间的子图像，以经度60°为纵向划分间隔，纵向划分该子图像获得3个子区域；对于纬度范围-30°至0°以及0°至30°之间的子图像，以经度30°为纵向划分间隔，纵向划分该子图像获得6个子区域。这样，对整个180°半全景视频的经纬图的子区域划分完成，共获得了20个子区域。
对于180°半全景视频的经纬图的子区域也可以进行下采样后再进行编码,与上述步骤404的实现方式可以相同,与上述步骤405的实现方式不同的可以是,以图13中的(a)为例,对于纬度范围-90°至-60°以及60°至90°之间的子图像,纵向不变,横向进行下采样,缩放系数为1/6;对于纬度范围-60°至-30°以及30°至60°之间的子图像,同样地,纵向不变,横向进行下采样,缩放系数为1/2;对于纬度范围-30°至0°以及0°至30°之间的子图像,不进行缩放。最终得到的缩放图像可以如图13中的(b)图所示。
上述对于180°半全景视频的经纬图的子区域划分方式,也可以应用于3D180°半全景视频的经纬图的子区域划分,与360°全景视频同理,3D180°半全景视频的经纬图也包括左眼的180°半全景视频的经纬图和右眼的180°半全景视频的经纬图。左眼的经纬图和右眼的经纬图可以拼接在一起,如图14所示,左眼的经纬图为图14的左半部分,右眼的经纬图为图14的右半部分,服务器可以先将左眼的经纬图和右眼的经纬图进行分割,如图14中的虚线所示。而后,将左眼的经纬图按照与180°半全景视频的经纬图的划分方式进行划分,并将右眼的经纬图同样按照与180°半全景视频的经纬图的划分方式进行划分,最终获得左眼的经纬图对应的20个子区域,右眼的经纬图对应的20个子区域,共40个子区域。
上述过程中,服务器可以根据拍摄设备拍摄的视频信号获取的是全景视频或半全景视频对应的经纬图,本申请实施例中的服务器还可以提供一种直接在球面全景信号中进行划分并获得图像子区域的方法。由于源图像是球面信号图,或者称为球面图,在码流封装方式,子区域划分方式也有所改变。在本实施例中,球面区域以经纬度的方式指定信号位置,约定经度范围为0至360°,纬度范围为-90°至90°(负数度数表示南纬,正数度数表示北纬)。
因此,本申请实施例提供一种图像的处理方法,如图15所示,包括:
1501、服务器对待处理图像的球面图进行横向划分,横向划分的划分位置为预先设定的纬度。
示例性的,服务器可以分别在球面上纬度为-60°,-30°,0°,30°以及60°处划纬度线,横向切分球面图。如图16中的(a)所示。
1502、服务器对待处理图像球面图进行纵向划分,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离,得到经纬图的各个子区域。
例如，可以在球面图上对于纬度范围-90°至-60°以及60°至90°之间的球面区域，以经度120°为纵向划分间隔，以经度线纵向划分球面图获得3个球面子区域；对于纬度范围-60°至-30°以及30°至60°之间的球面区域，以经度60°为纵向划分间隔，以经度线纵向划分球面获得6个球面子区域；对于纬度范围-30°至0°以及0°至30°之间的球面区域，以经度30°为纵向划分间隔，纵向划分球面图获得12个球面子区域。这样，对整个球面图的子区域划分完成，共获得了42个子区域，如图16中的(a)图所示。
1503、服务器对各个子区域的图像进行采样。
服务器可以先将子区域的图像按照预设尺寸映射到二维平面图像,这样就可以按照上述对经纬图的各个子区域按照第一采样间隔和第二采样间隔进行采样。
从三维的球面图映射到二维的经纬图的实现方式可以为:对球面图划分后的子区域的图像,在纵向按照预设尺寸高度进行均匀采样,在横向按照预设尺寸宽度进行均匀采样。而后,就可以对均匀采样后各个子区域的图像按照第一采样间隔在横向进行采样,纵向保持按照第二采样间隔进行采样。
示例性的,在将球面上所有子区域对应16图中的(a)的子区域进行图像信号映射,使得球面图上的每个子区域对应映射图像即二维的经纬图中的一个子区域,并对经纬图进行下采样。从球面信号到子区域图像的映射方法很多,这里不做限制,其中一种方式可以为:纬度方向上,对于各个球面子区域,球面信号按照图16中的(b)子区域图像预设尺寸高度进行均匀映射,均匀映射可以理解为均匀采样。经度方向上,对于纬度范围-90°至-60°以及60°至90°之间的子球面区域,球面信号按照以纬度方向采样率的1/4进行下采样映射,即缩放系数为1/4;对于纬度范围-60°至-30°以及30°至60°之间的子球面区域,球面信号按照以纬度方向采样率的1/2进行下采样映射,即缩放系数为1/2;对于纬度范围-30°至0°以及0°至30°之间的子球面区域,球面信号按照以纬度方向的相同采样率进行映射,即缩放系数为1。最终得到的采样后的经纬图的图像如图16中的(b)图所示。
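A hedged sketch of mapping one spherical sub-area to its 2D sub-image: uniform latitude sampling at the preset height, with the longitude samples thinned by the latitude-dependent factor; sample(lat, lon) stands in for reading the spherical signal and is an assumption here:

def map_sub_area(sample, lat_range, lon_range, height, width, lon_scale):
    lat0, lat1 = lat_range
    lon0, lon1 = lon_range
    out_w = max(1, int(width * lon_scale))  # horizontal downsampling
    return [[sample(lat0 + (lat1 - lat0) * (i + 0.5) / height,
                    lon0 + (lon1 - lon0) * (j + 0.5) / out_w)
             for j in range(out_w)]
            for i in range(height)]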
1504、服务器调整采样后的各个子区域的位置,使得调整后的各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。例如如图16中的(c)所示。步骤1504也可以选择不执行。
1505、服务器对拼接成的图像分Tile进行编码。
步骤1505的实现方式可以参见上述步骤407,这里不再赘述。
对于球面图的图像处理方法中,对各个子区域码流的封装方式可以与上述步骤408的方式相同,对子区域的位置信息的各种保存方式也可以相同,不同的是,当子区域由对待处理图像的球面图进行横向划分和纵向划分得到时,采样后的各个子区域形成采样后的球面图,位置信息包括子区域在球面图的图像中的位置以及经纬度范围,以及子区域在采样后的球面图的图像中的位置以及尺寸大小;或者,位置信息包括子区域在球面图图像中的位置以及经纬度范围,以及子区域在拼接成的图像中的位置以及尺寸大小。针对上述子区域划分方式描述进行变量语义修改如下:
方式一中修改如下语义:
proj_reg_width[i],proj_reg_height[i]:描述了第i个子区域在源图像,即球面图中的对应经纬度范围,也就是图16中的(b)中的子区域在图16中的(a)中对应的经纬度范围,比如图16中的(b)左上角第一个子区域在源图像中的经纬度范围是(120°,30°)。
proj_reg_top[i],proj_reg_left[i]：描述了第i个子区域的左上角像素在球面图中的对应位置，以经纬度表示，也就是图16中的(b)中的子区域左上点在图16中的(a)中对应的位置，比如上述第一个子区域在球面图中的位置是(0°,90°)。
方式二中修改如下语义:
proj_tile_width,proj_tile_height:描述了当前track的子区域在球面图中的经纬度范围,也就是图16中的(a)中当前子区域的经纬度范围;
proj_width,proj_height:描述了球面图经纬度范围,比如360°全景球面经纬度范围为(360°,180°)。
方式四中对于私有文件tile_info.dat内容,修改如下语义:
pic_width:表示球面图经度范围。
pic_height:表示球面图纬度范围。
tile_pic_width[]:数组,表示的是各个子区域在球面图中的经度范围,元素个数应为tile_num值。
tile_pic_height[]:数组,表示的是各个子区域在球面图中的纬度范围,元素个数应为tile_num值。
方式五中修改如下语义:
src_pic_width表示球面图经度范围,即图16中的(a)中球面图经度范围。
src_pic_height表示球面图纬度范围,即图16中的(a)中球面图像纬度范围。
src_tile_width表示当前子区域在球面图上的经度范围。
src_tile_height表示当前子区域在球面图上的纬度范围。
这样,本申请相对于对经纬图均匀划分的方式,这种非均匀的划分方式以及对图像进行缩放的方式,图像冗余度降低,可以使得tile-wise编码传输效率有较大提升。同时,也降低了终端解码器所需的最大解码能力,可以使得更高分辨率的源图像在现有解码能力下进行编码传输呈现成为可能。以均匀划分6×3为例,需要传输的像素占比最高达到55.6%,如果源图像的分辨率为4K(4096×2048),则解码器能力需达到约4K×1K。而使用本申请所述方法,传输的像素占比最高可以为25%,如果源图像的分辨率为4K(4096x2048),则解码器解码能力需2Kx1K。并且,该性能提高了解码与播放的速度,本申请方案相对于均匀划分方案解码播放处理效率更高。
本申请实施例还提供一种图像的处理方法,应用于服务器,如图17A,包括:
17A1、服务器保存全景图像的经纬图或球面图的各个子区域的图像对应的码流,子区域根据对全景图像的经纬图或球面图进行横向划分和纵向划分得到,其中,横向划分的划分位置为预先设定的纬度,纵向划分的划分位置由纬度确定,相邻的横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,纵向划分间隔为相邻的纵向划分的划分位置间的距离。
17A2、服务器向终端发送终端请求的保存的各个子区域的图像对应的码流中当前视角覆盖的子区域的码流。
其中,保存于服务器的子区域对应的图像在被编码之前,被按照第一采样间隔在横向进行采样;其中,子区域所对应的纬度越高,第一采样间隔越大;或者,被按照第二采样间隔在纵向进行采样。采样的具体实现方式可以参照上述实施例中的说明。
也就是说，本实施例中的服务器可以保存以上实施例中服务器对图像处理后的各个子区域的图像对应的码流，由于上述实施例中服务器对图像的处理采用的子区域划分方式以及采样过程，可以使得传输码流时的占用的带宽减小，解码端对于解码能力的要求降低，解码复杂度下降，解码速度得到提升，本实施例中的服务器在传输码流时占用的带宽相对现有技术会减小，终端的解码速度得到提升。
上述主要从各个网元之间交互的角度对本申请实施例提供的方案进行了介绍。可以理解的是,各个网元,例如服务器、终端等为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对服务器和终端等进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
在采用对应各个功能划分各个功能模块的情况下,图17示出了上述实施例中所涉及的服务器的一种可能的结构示意图,服务器17包括:划分单元1701,编码单元1702,采样单元1703、拼接单元1704、封装单元1705以及传输单元1706。划分单元1701可以用于支持服务器执行图4中的过程401、402,编码单元1702可以用于支持服务器执行图4中的过程403,图7中的过程407。采样单元1703可以用于支持服务器执行图7中的过程404、405,拼接单元1704用于支持服务器执行图7中的过程406,封装单元1705可以用于支持服务器执行图7中的过程408。其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用集成的单元的情况下,图18示出了上述实施例中所涉及的服务器的一种可能的结构示意图。服务器18包括:处理模块1802和通信模块1803。处理模块1802用于对服务器的动作进行控制管理,例如,处理模块1802用于支持服务器执行图4中的过程401、402,403、404、405、406、407以及408,和/或用于本文所描述的技术的其它过程。通信模块1803用于支持服务器与其他网络实体的通信,例如与终端之间的通信。服务器还可以包括存储模块1801,用于存储服务器的程序代码和数据。
其中,处理模块1802可以是处理器或控制器,例如可以是中央处理器(Central Processing Unit,CPU),通用处理器,数字信号处理器(Digital Signal Processor,DSP),专用集成电路(Application-Specific Integrated Circuit,ASIC),现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,DSP和微处理器的组合等等。通信模块13803可以是收发器、收发电路或通信接口等。存储模块1801可以是存储器。
当处理模块1802为处理器,通信模块1803为收发器,存储模块1801为存储器时, 本申请实施例所涉及的服务器可以为图19所示的服务器。
参阅图19所示,该服务器19包括:处理器1912、收发器1913、存储器1911以及总线1914。其中,收发器1913、处理器1912以及存储器1911通过总线1914相互连接;总线1914可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图19中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在采用对应各个功能划分各个功能模块的情况下,图20示出了上述实施例中所涉及的终端的一种可能的结构示意图,终端20包括:获取单元2001,解码单元2002,重采样单元2003以及播放单元2004。获取单元2001用于支持终端执行图10中的过程101、102、103、104,解码单元2002用于支持终端执行图10中的过程105,重采样单元2003用于支持终端执行图10中的过程106,播放单元2004用于支持终端执行图10中的过程107。其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用集成的单元的情况下,图21示出了上述实施例中所涉及的终端的一种可能的结构示意图。终端21包括:处理模块2102和通信模块2103。处理模块2102用于对终端的动作进行控制管理,例如,处理模块2102用于支持终端执行图10中的过程101-106,和/或用于本文所描述的技术的其它过程。通信模块2103用于支持终端与其他网络实体的通信,例如与服务器之间的通信。终端还可以包括存储模块2101,用于存储终端的程序代码和数据,还包括显示模块2104,用于支持终端执行图10中的过程107。
其中,处理模块2102可以是处理器或控制器,例如可以是CPU,通用处理器,DSP,ASIC,FPGA或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,DSP和微处理器的组合等等。通信模块2103可以是收发器、收发电路或通信接口等。存储模块2101可以是存储器。显示模块2104可以是显示器等。
当处理模块2102为处理器,通信模块2103为收发器,存储模块2101为存储器,显示模块2104为显示器时,本申请实施例所涉及的终端可以为图22所示的终端。
参阅图22所示,该终端22包括:处理器2212、收发器2213、存储器2211、显示器2215以及总线2214。其中,收发器2213、处理器2212、显示器2215以及存储器2211通过总线2214相互连接;总线2214可以是PCI总线或EISA总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图22中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在采用对应各个功能划分各个功能模块的情况下,图23示出了上述实施例中所涉及的服务器的一种可能的结构示意图,服务器23包括:存储单元2301,传输单元2302,存储单元2301用于支持服务器执行图17A中的过程17A1,传输单元2302用于支持服务器执行图17A中的过程17A2。其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用集成的单元的情况下,图24示出了上述实施例中所涉及的服务器的一种可能的结构示意图。服务器24包括:存储模块2402和通信模块2403。存储模块2402,用于存储服务器的程序代码和数据,例如该程序用于执行图17A中的过程17A1,通信模块2403用于执行图17A中的过程17A2。
当通信模块2403为收发器，存储模块2402为存储器时，本申请实施例所涉及的服务器可以为图25所示的服务器。
参阅图25所示,该服务器25包括:收发器2511、存储器2512以及总线2513。其中,收发器2511、存储器2512通过总线2513相互连接;总线2513可以是PCI总线或EISA总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图25中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
结合本申请公开内容所描述的方法或者算法的步骤可以硬件的方式来实现,也可以是由处理器执行软件指令的方式来实现。软件指令可以由相应的软件模块组成,软件模块可以被存放于随机存取存储器(Random Access Memory,RAM)、闪存、只读存储器(Read Only Memory,ROM)、可擦除可编程只读存储器(Erasable Programmable ROM,EPROM)、电可擦可编程只读存储器(Electrically EPROM,EEPROM)、寄存器、硬盘、移动硬盘、只读光盘(CD-ROM)或者本领域熟知的任何其它形式的存储介质中。一种示例性的存储介质耦合至处理器,从而使处理器能够从该存储介质读取信息,且可向该存储介质写入信息。当然,存储介质也可以是处理器的组成部分。处理器和存储介质可以位于ASIC中。另外,该ASIC可以位于核心网接口设备中。当然,处理器和存储介质也可以作为分立组件存在于核心网接口设备中。
本领域技术人员应该可以意识到,在上述一个或多个示例中,本申请所描述的功能可以用硬件、软件、固件或它们的任意组合来实现。当使用软件实现时,可以将这些功能存储在计算机可读介质中或者作为计算机可读介质上的一个或多个指令或代码进行传输。计算机可读介质包括计算机存储介质和通信介质,其中通信介质包括便于从一个地方向另一个地方传送计算机程序的任何介质。存储介质可以是通用或专用计算机能够存取的任何可用介质。
以上所述的具体实施方式,对本申请的目的、技术方案和有益效果进行了进一步详细说明,所应理解的是,以上所述仅为本申请的具体实施方式而已,并不用于限定本申请的保护范围,凡在本申请的技术方案的基础之上,所做的任何修改、等同替换、改进等,均应包括在本申请的保护范围之内。

Claims (46)

  1. 一种图像的处理方法,应用于服务器,其特征在于,包括:
    对待处理图像的经纬图或球面图进行横向划分和纵向划分,以得到所述经纬图或所述球面图的各个子区域,其中,所述横向划分的划分位置为预先设定的纬度,所述纵向划分的划分位置由纬度确定,相邻的所述横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,所述纵向划分间隔为相邻的所述纵向划分的划分位置间的距离;
    对所述得到的各个子区域的图像进行编码。
  2. 根据权利要求1所述的方法,其特征在于,所述纵向划分的划分位置由纬度确定包括:所述纵向划分的划分位置所处的纬度越高,所述纵向划分间隔越大。
  3. 根据权利要求1或2所述的方法,其特征在于,在所述对所述得到的各个子区域的图像进行编码之前,所述方法还包括:
    对所述子区域的图像按照第一采样间隔在横向进行采样;其中,所述子区域所对应的纬度越高,所述第一采样间隔越大;
    所述对所述得到的各个子区域的图像进行编码,包括:
    对所述采样后的各个子区域的图像进行编码。
  4. 根据权利要求3所述的方法,其特征在于,在对所述得到的各个子区域的图像进行编码之前,所述方法还包括:
    对所述子区域的图像按照第二采样间隔在纵向进行采样。
  5. 根据权利要求3或4所述的方法,其特征在于,当所述子区域由对所述待处理图像的球面图进行横向划分和纵向划分得到时,在所述对所述子区域的图像按照第一采样间隔在横向进行采样之前,所述方法还包括:
    将所述子区域的图像按照预设尺寸映射为二维平面图像;
    所述对所述子区域的图像按照第一采样间隔在横向进行采样,包括:
    对所述子区域的图像映射的二维平面图像按照所述第一采样间隔在横向进行采样。
  6. 根据权利要求3至5任一项所述的方法,其特征在于,在所述对所述采样后的各个子区域的图像进行编码之前,所述方法还包括:
    调整所述采样后的各个子区域的位置,使得所述调整后的各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。
  7. 根据权利要求6所述的方法,其特征在于,所述对所述采样后的各个子区域的图像进行编码包括:
    对所述拼接成的图像分片(Tile)进行编码。
  8. 根据权利要求1-7任一项所述的方法,其特征在于,在所述对所述得到的各个子区域的图像进行编码之后,所述方法还包括:
    对通过所述编码获得的各个子区域的图像对应的码流进行独立封装，并且编码所述各个子区域的位置信息；其中，所述编码后的所有所述子区域的位置信息与所有所述子区域的码流存在于一个轨迹中；或者，所述编码后的每个所述子区域的位置信息与码流存在于各自的轨迹中；或者，所述编码后的所有所述子区域的位置信息存在于媒体呈现描述（MPD）；或者，所述编码后的所有所述子区域的位置信息存在于私有文件中，且所述私有文件的地址存在于MPD中；或者，所述编码后的每个所述子区域的位置信息存在于所述每个子区域的码流的辅助增强信息（SEI）中。
  9. 根据权利要求8所述的方法,其特征在于,当所述子区域由对所述待处理图像的经纬图进行横向划分和纵向划分得到时,所述采样后的各个子区域形成采样后的经纬图,所述位置信息包括所述子区域在所述经纬图中的位置以及尺寸大小,以及所述子区域在所述采样后的经纬图中的位置以及尺寸大小;或者,所述位置信息包括所述子区域在所述经纬图中的位置以及尺寸大小,以及所述子区域在所述拼接成的图像中的位置以及尺寸大小;或者,当所述子区域由对所述待处理图像的球面图进行横向划分和纵向划分得到时,所述采样后的各个子区域形成采样后的球面图,所述位置信息包括所述子区域在所述球面图的图像中的位置以及经纬度范围,以及所述子区域在所述采样后的球面图的图像中的位置以及尺寸大小;或者,所述位置信息包括所述子区域在所述球面图图像中的位置以及经纬度范围,以及所述子区域在所述拼接成的图像中的位置以及尺寸大小。
  10. 根据权利要求8或9所述的方法,其特征在于,所述私有文件还包括用于表征用户视点与所述用户视点的视角覆盖的子区域的编号的对应关系的信息。
  11. 根据权利要求8至10任一项所述的方法,其特征在于,所述私有文件还包括用于表征所述用户视角覆盖的子区域中需优先显示的子区域个数的信息、所述需优先显示的子区域编号的信息、次优先显示的子区域编号的信息以及不显示的子区域编号的信息。
  12. 根据权利要求1至11任一项所述的方法,其特征在于,所述经纬图包括左眼对应的经纬图和右眼对应的经纬图;
    在所述对待处理图像的经纬图或球面图进行横向划分和纵向划分之前,所述方法还包括:
    将所述左眼对应的经纬图与所述右眼对应的经纬图进行分割;
    所述对待处理图像的经纬图或球面图进行横向划分和纵向划分,包括:
    对所述左眼对应的经纬图进行所述横向划分和所述纵向划分,以及对所述右眼对应的经纬图进行所述横向划分和所述纵向划分。
  13. 根据权利要求9至12任一项所述的方法,其特征在于,所述方法还包括:
    将所述通过所述编码获得的各个子区域的图像对应的码流发送给终端;
    或者,接收终端发送的视角信息,根据所述视角信息获取所述视角信息对应的子区域,将所述视角信息对应的子区域的码流发送给所述终端;
    或者,接收终端发送的子区域的编号,将所述子区域的编号对应的码流发送给所述终端。
  14. 根据权利要求1-13任一项所述的方法,其特征在于,
    所述经纬图为360度全景视频图像的经纬图,或所述360度全景视频图像的经纬图的一部分;或
    所述球面图为360度全景视频图像的球面图,或所述360度全景视频图像的球面图的一部分。
  15. 一种图像的处理方法,应用于终端,其特征在于,包括:
    确定全景图像各个子区域的位置信息;
    根据所述确定的各个子区域的位置信息,确定当前视角覆盖的子区域在所述全景图像中的位置信息;
    确定子区域的第一采样间隔;
    根据所述确定的当前视角覆盖的子区域位置信息,获取所述当前视角覆盖的子区域对应的码流;
    对所述码流进行解码,以得到所述当前视角覆盖的子区域的图像;
    根据所述确定的当前视角覆盖的子区域的位置信息以及所述第一采样间隔对所述解码后的图像进行重采样,并对所述重采样后的图像进行播放。
  16. 根据权利要求15所述的方法,其特征在于,所述确定全景图像各个子区域的位置信息,包括:
    接收服务器发送的第一信息,所述第一信息包括所述全景图像的各个子区域的轨迹以及所述各个子区域的码流,所述轨迹包括所述全景图像的所有子区域的位置信息;
    根据所述轨迹,得到所述全景图像中的各个子区域的位置信息。
  17. 根据权利要求15所述的方法,其特征在于,所述确定全景图像中各个子区域的位置信息,包括:
    接收服务器发送的媒体呈现描述(MPD),其中,所述MPD包括所述各个子区域的位置信息,或者,所述MPD中包括私有文件的地址,且所述私有文件包括所述各个子区域的位置信息;
    解析所述MPD,以获取所述各个子区域的位置信息。
  18. 根据权利要求15所述的方法,其特征在于,所述子区域的位置信息存在于所述子区域对应的码流的辅助增强信息(SEI)中。
  19. 根据权利要求15-18任一项所述的方法,其特征在于,所述获取所述当前视角覆盖的子区域对应的码流,包括:
    从所述终端的存储器中获取所述当前视角覆盖的子区域对应的码流;
    或者,向服务器请求获取所述当前视角覆盖的子区域对应的码流。
  20. 根据权利要求19所述的方法,其特征在于,所述向服务器请求获取所述当前视角覆盖的子区域对应的码流,包括:
    将指示所述当前视角的信息发送给所述服务器,接收所述服务器发送的所述当前视角覆盖的子区域对应的码流;
    或者,按照所述终端与所述服务器预设的协议,从所述服务器获取所述当前视角覆盖的子区域对应的码流,所述协议包括视角与该视角覆盖的子区域的对应关系。
  21. 根据权利要求16至20任一项所述的方法,其特征在于,所述确定子区域的第一采样间隔,包括:
    确定预设的采样间隔为所述第一采样间隔;
    或者,从所述服务器接收所述第一采样间隔;
    或者,根据从所述服务器接收到的所述各个子区域的位置信息获取所述第一采样间隔。
  22. 一种图像的处理方法,应用于服务器,其特征在于,包括:
    保存全景图像的经纬图或球面图的各个子区域的图像对应的码流,所述子区域根据对所述全景图像的经纬图或球面图进行横向划分和纵向划分得到,其中,所述横向划分的划分位置为预先设定的纬度,所述纵向划分的划分位置由纬度确定,相邻的所述横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,所述纵向划分间隔为相邻的所述纵向划分的划分位置间的距离;
    向终端发送所述终端请求的所述保存的各个子区域的图像对应的码流中当前视角覆盖的子区域的码流。
  23. 根据权利要求22所述的方法,其特征在于,所述保存于所述服务器的子区域对应的图像在被编码之前,被按照第一采样间隔在横向进行采样;其中,所述子区域所对应的纬度越高,所述第一采样间隔越大;
    或者,
    被按照第二采样间隔在纵向进行采样。
  24. 一种服务器,其特征在于,包括:
    划分单元,用于对待处理图像的经纬图或球面图进行横向划分和纵向划分,以得到所述经纬图或所述球面图的各个子区域,其中,所述横向划分的划分位置为预先设定的纬度,所述纵向划分的划分位置由纬度确定,相邻的所述横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,所述纵向划分间隔为相邻的所述纵向划分的划分位置间的距离;
    编码单元,用于对所述得到的各个子区域的图像进行编码。
  25. 根据权利要求24所述的服务器,其特征在于,所述纵向划分的划分位置由纬度确定包括:所述纵向划分的划分位置所处的纬度越高,所述纵向划分间隔越大。
  26. 根据权利要求24或25所述的服务器,其特征在于,还包括采样单元,用于:
    对所述子区域的图像按照第一采样间隔在横向进行采样；其中，所述子区域所对应的纬度越高，所述第一采样间隔越大；
    所述编码单元用于:
    对所述采样后的各个子区域的图像进行编码。
  27. 根据权利要求26所述的服务器,其特征在于,所述采样单元还用于:
    对所述子区域的图像按照第二采样间隔在纵向进行采样。
  28. 根据权利要求26或27所述的服务器,其特征在于,所述采样单元还用于:
    将所述子区域的图像按照预设尺寸映射为二维平面图像;
    对所述子区域的图像映射的二维平面图像按照所述第一采样间隔在横向进行采样。
  29. 根据权利要求26至28任一项所述的服务器,其特征在于,还包括拼接单元,用于:
    调整所述采样后的各个子区域的位置,使得所述调整后的各个子区域的图像拼接成的图像的横向边缘对齐,且纵向边缘对齐。
  30. 根据权利要求29所述的服务器,其特征在于,所述编码单元用于:
    对所述拼接成的图像分片(Tile)进行编码。
  31. 根据权利要求24-30任一项所述的服务器,其特征在于,还包括封装单元,用于:
    对通过所述编码获得的各个子区域的图像对应的码流进行独立封装,并且编码所述各个子区域的位置信息;其中,所述编码后的所有所述子区域的位置信息与所有所述子区域的码流存在于一个轨迹中;或者,所述编码后的每个所述子区域的位置信息与码流存在于各自的轨迹中;或者,所述编码后的所有所述子区域的位置信息存在于媒体呈现描述(MPD);或者,所述编码后的所有所述子区域的位置信息存在于私有文件中,且所述私有文件的地址存在于MPD中;或者,所述编码后的每个所述子区域的位置信息存在于所述每个子区域的码流的辅助增强信息(SEI)中。
  32. 根据权利要求31所述的服务器,其特征在于,当所述子区域由对所述待处理图像的经纬图进行横向划分和纵向划分得到时,所述采样后的各个子区域形成采样后的经纬图,所述位置信息包括所述子区域在所述经纬图中的位置以及尺寸大小,以及所述子区域在所述采样后的经纬图中的位置以及尺寸大小;或者,所述位置信息包括所述子区域在所述经纬图中的位置以及尺寸大小,以及所述子区域在所述拼接成的图像中的位置以及尺寸大小;或者,当所述子区域由对所述待处理图像的球面图进行横向划分和纵向划分得到时,所述采样后的各个子区域形成采样后的球面图,所述位置信息包括所述子区域在所述球面图的图像中的位置以及经纬度范围,以及所述子区域在所述采样后的球面图的图像中的位置以及尺寸大小;或者,所述位置信息包括所述子区域在所述球面图图像中的位置以及经纬度范围,以及所述子区域在所述拼接成的图像中的位置以及尺寸大小。
  33. 根据权利要求31或32所述的服务器,其特征在于,所述私有文件还包括用于表征用户视点与所述用户视点的视角覆盖的子区域的编号的对应关系的信息。
  34. 根据权利要求32至33任一项所述的服务器,其特征在于,所述私有文件还包括用于表征所述用户视角覆盖的子区域中需优先显示的子区域个数的信息、所述需优先显示的子区域编号的信息、次优先显示的子区域编号的信息以及不显示的子区域编号的信息。
  35. 根据权利要求24至34任一项所述的服务器,其特征在于,所述经纬图包括左眼对应的经纬图和右眼对应的经纬图;
    所述划分单元用于:
    将所述左眼对应的经纬图与所述右眼对应的经纬图进行分割;
    对所述左眼对应的经纬图进行所述横向划分和所述纵向划分,以及对所述右眼对应的经纬图进行所述横向划分和所述纵向划分。
  36. 根据权利要求32至35任一项所述的服务器,其特征在于,还包括传输单元,用于:
    将所述通过所述编码获得的各个子区域的图像对应的码流发送给终端;
    或者,接收终端发送的视角信息,根据所述视角信息获取所述视角信息对应的子区域,将所述视角信息对应的子区域的码流发送给所述终端;
    或者,接收终端发送的子区域的编号,将所述子区域的编号对应的码流发送给所述终端。
  37. 根据权利要求24-36任一项所述的服务器,其特征在于,
    所述经纬图为360度全景视频图像的经纬图，或所述360度全景视频图像的经纬图的一部分；或
    所述球面图为360度全景视频图像的球面图,或所述360度全景视频图像的球面图的一部分。
  38. 一种终端,其特征在于,包括:
    获取单元,用于确定全景图像各个子区域的位置信息;
    所述获取单元还用于,根据所述确定的各个子区域的位置信息,确定当前视角覆盖的子区域在所述全景图像中的位置信息,并确定子区域的第一采样间隔;
    所述获取单元,还用于根据所述确定的当前视角覆盖的子区域位置信息,获取所述当前视角覆盖的子区域对应的码流;
    解码单元,用于对所述码流进行解码,以得到所述当前视角覆盖的子区域的图像;
    重采样单元,用于根据所述确定的当前视角覆盖的子区域的位置信息以及所述第一采样间隔对所述解码后的图像进行重采样;
    播放单元,用于所述重采样后的图像进行播放。
  39. 根据权利要求38所述的终端,其特征在于,所述获取单元用于:
    接收服务器发送的第一信息,所述第一信息包括所述全景图像的各个子区域的轨迹以及所述各个子区域的码流,所述轨迹包括所述全景图像的所有子区域的位置信息;
    根据所述轨迹,得到所述全景图像中的各个子区域的位置信息。
  40. 根据权利要求38所述的终端,其特征在于,所述获取单元用于:
    接收服务器发送的媒体呈现描述(MPD),其中,所述MPD包括所述各个子区域的位置信息,或者,所述MPD中包括私有文件的地址,且所述私有文件包括所述各个子区域的位置信息;
    所述获取单元还用于:解析所述MPD,以获取所述各个子区域的位置信息。
  41. 根据权利要求38所述的终端,其特征在于,所述子区域的位置信息存在于所述子区域对应的码流的辅助增强信息(SEI)中。
  42. 根据权利要求38-41任一项所述的终端,其特征在于,所述获取单元用于:
    从所述终端的存储器中获取所述当前视角覆盖的子区域对应的码流;
    或者,向服务器请求获取所述当前视角覆盖的子区域对应的码流。
  43. 根据权利要求42所述的终端,其特征在于,所述获取单元用于:
    将指示所述当前视角的信息发送给所述服务器,接收所述服务器发送的所述当前视角覆盖的子区域对应的码流;
    或者,按照所述终端与所述服务器预设的协议,从所述服务器获取所述当前视角覆盖的子区域对应的码流,所述协议包括视角与该视角覆盖的子区域的对应关系。
  44. 根据权利要求39至43任一项所述的终端,其特征在于,所述获取单元用于:
    确定预设的采样间隔为所述第一采样间隔;
    或者,从所述服务器接收所述第一采样间隔;
    或者,根据从所述服务器接收到的所述各个子区域的位置信息获取所述第一采样间隔。
  45. 一种服务器,其特征在于,包括:
    存储单元,用于保存全景图像的经纬图或球面图的各个子区域的图像对应的码流, 所述子区域根据对所述全景图像的经纬图或球面图进行横向划分和纵向划分得到,其中,所述横向划分的划分位置为预先设定的纬度,所述纵向划分的划分位置由纬度确定,相邻的所述横向划分的划分位置所构成的区域中至少存在两种纵向划分间隔,所述纵向划分间隔为相邻的所述纵向划分的划分位置间的距离;
    传输单元,用于向终端发送所述终端请求的所述保存的各个子区域的图像对应的码流中当前视角覆盖的子区域的码流。
  46. 根据权利要求45所述的服务器,其特征在于,所述保存于所述服务器的子区域对应的图像在被编码之前,被按照第一采样间隔在横向进行采样;其中,所述子区域所对应的纬度越高,所述第一采样间隔越大;
    或者,被按照第二采样间隔在纵向进行采样。
PCT/CN2018/081177 2017-07-31 2018-03-29 一种图像的处理方法、终端和服务器 WO2019024521A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
BR112020002235-7A BR112020002235A2 (pt) 2017-07-31 2018-03-29 método de processamento de imagens, terminal, e servidor
CA3069034A CA3069034C (en) 2017-07-31 2018-03-29 Image processing method, terminal, and server
RU2020108306A RU2764462C2 (ru) 2017-07-31 2018-03-29 Сервер, оконечное устройство и способ обработки изображений
EP18841334.8A EP3633993A4 (en) 2017-07-31 2018-03-29 IMAGE PROCESSING METHOD, TERMINAL, AND SERVER
SG11201913824XA SG11201913824XA (en) 2017-07-31 2018-03-29 Image processing method, terminal, and server
AU2018311589A AU2018311589B2 (en) 2017-07-31 2018-03-29 Image processing method, terminal, and server
JP2019571623A JP6984841B2 (ja) 2017-07-31 2018-03-29 イメージ処理方法、端末およびサーバ
KR1020207001630A KR102357137B1 (ko) 2017-07-31 2018-03-29 이미지 처리 방법, 단말기, 및 서버
PH12020500047A PH12020500047A1 (en) 2017-07-31 2020-01-06 Image processing method, terminal and server
US16/739,568 US11032571B2 (en) 2017-07-31 2020-01-10 Image processing method, terminal, and server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710645108.XA CN109327699B (zh) 2017-07-31 2017-07-31 一种图像的处理方法、终端和服务器
CN201710645108.X 2017-07-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/739,568 Continuation US11032571B2 (en) 2017-07-31 2020-01-10 Image processing method, terminal, and server

Publications (1)

Publication Number Publication Date
WO2019024521A1 true WO2019024521A1 (zh) 2019-02-07

Family

ID=65232513

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/081177 WO2019024521A1 (zh) 2017-07-31 2018-03-29 一种图像的处理方法、终端和服务器

Country Status (12)

Country Link
US (1) US11032571B2 (zh)
EP (1) EP3633993A4 (zh)
JP (1) JP6984841B2 (zh)
KR (1) KR102357137B1 (zh)
CN (1) CN109327699B (zh)
AU (1) AU2018311589B2 (zh)
BR (1) BR112020002235A2 (zh)
CA (1) CA3069034C (zh)
PH (1) PH12020500047A1 (zh)
RU (1) RU2764462C2 (zh)
SG (1) SG11201913824XA (zh)
WO (1) WO2019024521A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020198164A1 (en) * 2019-03-26 2020-10-01 Pcms Holdings, Inc. System and method for multiplexed rendering of light fields
WO2021016176A1 (en) * 2019-07-23 2021-01-28 Pcms Holdings, Inc. System and method for adaptive lenslet light field transmission and rendering
CN113766272A (zh) * 2020-06-04 2021-12-07 腾讯科技(深圳)有限公司 一种沉浸媒体的数据处理方法

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102598082B1 (ko) * 2016-10-28 2023-11-03 삼성전자주식회사 영상 표시 장치, 모바일 장치 및 그 동작방법
US11546402B2 (en) * 2019-01-04 2023-01-03 Tencent America LLC Flexible interoperability and capability signaling using initialization hierarchy
CN111813875B (zh) * 2019-04-11 2024-04-05 浙江宇视科技有限公司 地图点位信息处理方法、装置及服务器
US20220239947A1 (en) * 2019-05-23 2022-07-28 Vid Scale, Inc. Video-based point cloud streams
CN111212267A (zh) * 2020-01-16 2020-05-29 Juhaokan Technology Co., Ltd. Panoramic image partitioning method and server
CN111314739B (zh) * 2020-02-17 2022-05-17 Juhaokan Technology Co., Ltd. Image processing method, server, and display device
CN114760522A (zh) * 2020-12-29 2022-07-15 Alibaba Group Holding Limited Data processing method, apparatus, and device
CN113347403B (zh) * 2021-04-19 2023-10-27 Zhejiang University Image processing method and apparatus
CN113553452A (zh) * 2021-06-16 2021-10-26 Zhejiang University of Science and Technology Virtual-reality-based spatial domain name processing method and apparatus
CN113630622B (zh) * 2021-06-18 2024-04-26 Zhongtu Yunchuang Intelligent Technology (Beijing) Co., Ltd. Panoramic video image processing method, server, target device, apparatus, and system
CN114268835B (zh) * 2021-11-23 2022-11-01 Beihang University Low-traffic spatio-temporal slicing method for VR panoramic video

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6331869B1 (en) * 1998-08-07 2001-12-18 Be Here Corporation Method and apparatus for electronically distributing motion panoramic images
RU2504917C2 (ru) * 2008-10-07 2014-01-20 Telefonaktiebolaget LM Ericsson (Publ) Media container file
US9948939B2 (en) * 2012-12-07 2018-04-17 Qualcomm Incorporated Advanced residual prediction in scalable and multi-view video coding
US10244253B2 (en) * 2013-09-13 2019-03-26 Qualcomm Incorporated Video coding techniques using asymmetric motion partitioning
EP3092806A4 (en) * 2014-01-07 2017-08-23 Nokia Technologies Oy Method and apparatus for video coding and decoding
US9628822B2 (en) * 2014-01-30 2017-04-18 Qualcomm Incorporated Low complexity sample adaptive offset encoding
GB2525170A (en) * 2014-04-07 2015-10-21 Nokia Technologies Oy Stereo viewing
JP6642427B2 (ja) 2014-06-30 2020-02-05 Sony Corporation Information processing apparatus and method
US10462480B2 (en) * 2014-12-31 2019-10-29 Microsoft Technology Licensing, Llc Computationally efficient motion estimation
US10666979B2 (en) 2015-03-05 2020-05-26 Sony Corporation Image processing device and image processing method for encoding/decoding omnidirectional image divided vertically
WO2017116952A1 (en) * 2015-12-29 2017-07-06 Dolby Laboratories Licensing Corporation Viewport independent image coding and rendering
CN105872546B (zh) * 2016-06-13 2019-05-28 Shanghai Jietu Software Technology Co., Ltd. Method and system for compressed storage of panoramic images
CN106530216B (zh) * 2016-11-05 2019-10-29 Shenzhen Arashi Vision Co., Ltd. Panoramic image file processing method and system
US10574886B2 (en) * 2017-11-02 2020-02-25 Thermal Imaging Radar, LLC Generating panoramic video for video management systems

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106063277A (zh) * 2014-03-03 2016-10-26 NextVR Inc. Methods and apparatus for streaming content
US20150264259A1 (en) * 2014-03-17 2015-09-17 Sony Computer Entertainment Europe Limited Image processing
CN106375760A (zh) * 2016-10-11 2017-02-01 Shanghai Guomao Digital Technology Co., Ltd. Panoramic video polygon sampling method and apparatus
CN106899840A (zh) * 2017-03-01 2017-06-27 Peking University Shenzhen Graduate School Panoramic image mapping method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3633993A4

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020198164A1 (en) * 2019-03-26 2020-10-01 Pcms Holdings, Inc. System and method for multiplexed rendering of light fields
US11991402B2 (en) 2019-03-26 2024-05-21 Interdigital Vc Holdings, Inc. System and method for multiplexed rendering of light fields
WO2021016176A1 (en) * 2019-07-23 2021-01-28 Pcms Holdings, Inc. System and method for adaptive lenslet light field transmission and rendering
CN113766272A (zh) * 2020-06-04 2021-12-07 Tencent Technology (Shenzhen) Co., Ltd. Data processing method for immersive media
CN113766272B (zh) * 2020-06-04 2023-02-10 Tencent Technology (Shenzhen) Co., Ltd. Data processing method for immersive media

Also Published As

Publication number Publication date
KR20200019718A (ko) 2020-02-24
CA3069034A1 (en) 2019-02-07
RU2020108306A3 (zh) 2021-08-26
RU2764462C2 (ru) 2022-01-17
CN109327699A (zh) 2019-02-12
CN109327699B (zh) 2021-07-16
AU2018311589A1 (en) 2020-02-20
US20200154138A1 (en) 2020-05-14
KR102357137B1 (ko) 2022-02-08
BR112020002235A2 (pt) 2020-07-28
EP3633993A1 (en) 2020-04-08
CA3069034C (en) 2024-05-28
SG11201913824XA (en) 2020-01-30
JP2020529149A (ja) 2020-10-01
US11032571B2 (en) 2021-06-08
RU2020108306A (ru) 2021-08-26
JP6984841B2 (ja) 2021-12-22
AU2018311589B2 (en) 2022-12-22
EP3633993A4 (en) 2020-06-24
PH12020500047A1 (en) 2020-09-28

Similar Documents

Publication Publication Date Title
WO2019024521A1 (zh) Image processing method, terminal, and server
KR102307819B1 (ko) Method, apparatus, and stream for formatting an immersive video for legacy and immersive rendering devices
KR102559862B1 (ко) Method, device, and computer program for transmitting media content
US20200112710A1 (en) Method and device for transmitting and receiving 360-degree video on basis of quality
CN112189345B (zh) Method, device, or medium for encoding or decoding data representative of a 3D scene
CN109362242A Video data processing method and apparatus
CN110741649B (zh) Method and apparatus for track composition
CN113852829A Method and apparatus for encapsulating and decapsulating point cloud media files, and storage medium
WO2023061131A1 (zh) Media file encapsulation method, apparatus, and device, and storage medium
CN110637463B (zh) 360-degree video processing method
JP2022541908A (ja) Method and apparatus for delivering volumetric video content
CN115022715A Data processing method and device for immersive media
CN115567756A Viewport-based VR video system and processing method
WO2023024841A1 (zh) Method and apparatus for encapsulating and decapsulating point cloud media files, and storage medium
WO2023016293A1 (zh) File encapsulation method, apparatus, and device for free-viewpoint video, and storage medium
WO2023024839A1 (zh) Media file encapsulation and decapsulation method, apparatus, and device, and storage medium
CN113497928B (zh) Data processing method for immersive media and related device
WO2023024843A1 (zh) Media file encapsulation and decapsulation method, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18841334

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019571623

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 3069034

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2018841334

Country of ref document: EP

Effective date: 20191230

Ref document number: 20207001630

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112020002235

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 2018311589

Country of ref document: AU

Date of ref document: 20180329

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 112020002235

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20200131