WO2024037643A1 - Image display method, image processing method, apparatus, device and medium - Google Patents

Image display method, image processing method, apparatus, device and medium

Info

Publication number
WO2024037643A1
WO2024037643A1 · PCT/CN2023/113854 · CN2023113854W
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
image
target
encoded data
perspective
Prior art date
Application number
PCT/CN2023/113854
Other languages
English (en)
French (fr)
Inventor
彭浩翔
高国栋
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2024037643A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 - Manipulating 3D models or images for computer graphics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/64 - Three-dimensional objects
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 - Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 - Parsing or analysis of headers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 - Processing image signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 - Processing image signals
    • H04N13/111 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 - Processing image signals
    • H04N13/15 - Processing image signals for colour aspects of image signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 - Processing image signals
    • H04N13/161 - Encoding, multiplexing or demultiplexing different image signal components
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 - Server components or server architectures
    • H04N21/218 - Source of audio or video content, e.g. local disk arrays
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 - Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 - Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client
    • H04N21/63 - Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients; Communication protocols; Addressing
    • H04N21/643 - Communication protocols
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 - Monomedia components thereof

Definitions

  • Embodiments of the present disclosure relate to an image display method, an image processing method, an apparatus, a device, and a medium.
  • In some scenarios, the client needs to display two-dimensional images of a three-dimensional object model from multiple perspectives.
  • In order to display two-dimensional images corresponding to multiple viewing angles of the three-dimensional object model on the client, the three-dimensional object model is generally downloaded and processed directly, after which the two-dimensional images of the three-dimensional object model at each viewing angle are displayed.
  • However, the spatial information of a 3D object model is relatively complex and occupies a large amount of memory. Directly processing the 3D object model takes a long time, resulting in poor real-time performance of the display process, which ultimately degrades the user's viewing experience.
  • the present disclosure provides an image display method, image processing method, device, equipment and medium.
  • the present disclosure provides an image display method, applied to a client, and the method includes:
  • the present disclosure provides an image processing method applied to a server.
  • the method includes:
  • the two-dimensional image set is sent to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
  • the present disclosure provides an image display device, which is configured on a client and includes:
  • a receiving module configured to receive a two-dimensional image set sent by the server, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different viewing angles;
  • an analysis module configured to, in response to an instruction to display the three-dimensional object model at a target perspective, parse the two-dimensional target image corresponding to the target perspective from the two-dimensional image set;
  • an image display module configured to display the two-dimensional target image.
  • the present disclosure provides an image processing device, which is configured on a server and includes:
  • An acquisition module configured to acquire a set of two-dimensional images generated by a three-dimensional object model at multiple viewing angles, wherein the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different viewing angles;
  • a sending module configured to send the two-dimensional image set to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
  • the present disclosure provides a computer-readable storage medium. Instructions are stored in the computer-readable storage medium. When the instructions are run on a terminal device, the terminal device implements the above method.
  • the present disclosure provides a device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the computer program, the above method is implemented.
  • the present disclosure provides a computer program product.
  • the computer program product includes a computer program/instruction. When the computer program/instruction is executed by a processor, the above method is implemented.
  • Figure 1 is a schematic flowchart of an image display method provided by an embodiment of the present disclosure
  • Figure 2 is a logical schematic diagram of an image display method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • Figure 4 is a schematic diagram of acquiring a two-dimensional image provided by an embodiment of the present disclosure
  • Figure 5 is a logical schematic diagram of an image processing and display method provided by an embodiment of the present disclosure.
  • Figure 6 is a schematic structural diagram of an image display device provided by an embodiment of the present disclosure.
  • Figure 7 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure.
  • Figure 8 is a schematic structural diagram of a client or server provided by an embodiment of the present disclosure.
  • 3D object models have high definition and texture complexity.
  • downloading and processing of 3D object models require high network speed and memory, making it difficult to meet the demand for real-time image display.
  • embodiments of the present disclosure provide an image display method, device, equipment and medium.
  • the image display method can be applied to the client.
  • clients can include but are not limited to mobile phones, tablets, laptops, desktop computers, smart homes, wearable devices, and vehicle-mounted devices.
  • FIG. 1 shows a schematic flowchart of an image display method provided by an embodiment of the present disclosure. As shown in FIG. 1, the image display method includes the following steps.
  • the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model from multiple different perspectives.
  • The client downloads and stores the two-dimensional image set from the server. Since the two-dimensional image set records two-dimensional images of the three-dimensional object model at multiple different perspectives, it can provide two-dimensional images at any of those viewing angles.
  • the three-dimensional object model may be a three-dimensional model of the object to be displayed.
  • the three-dimensional object model may have color features, transparency features, or both color features and transparency features.
  • the viewing angle can be understood as the screen viewing angle.
  • the perspective can be in the form of latitude and longitude.
  • the latitude range can be [-90,90] and the longitude range [0,360].
  • the two-dimensional image refers to the image of the three-dimensional object model at any viewing angle.
  • The client obtains an externally input display instruction carrying the target perspective, and then determines the two-dimensional target image corresponding to the target perspective based on the two-dimensional image set. Specifically, if the two-dimensional image set includes multiple two-dimensional images corresponding to different viewing angles, the two-dimensional target image corresponding to the target viewing angle can be searched directly from the set; if the two-dimensional image set includes encoded data of the two-dimensional images corresponding to different viewing angles, the encoded data corresponding to the target perspective is decoded to obtain the two-dimensional target image corresponding to the target perspective.
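The two branches above can be sketched as follows. This is a minimal illustrative sketch, not the patent's actual implementation: the image-set layout (a dict keyed by a (longitude, latitude) tuple) and the `decoder` callable are assumptions for demonstration.

```python
def get_target_image(image_set: dict, perspective: tuple, decoder=None):
    """Return the 2D image for `perspective` (longitude, latitude).

    If `decoder` is None, the set is assumed to hold ready-to-display
    images and a direct lookup suffices; otherwise the set holds encoded
    data that must be decoded first.
    """
    entry = image_set[perspective]
    if decoder is None:
        return entry          # direct search in the image set
    return decoder(entry)     # decode the encoded data for this perspective

# Usage with toy data:
images = {(0, 0): "img_front", (90, 0): "img_side"}
assert get_target_image(images, (0, 0)) == "img_front"

encoded = {(0, 0): b"\x01\x02"}
assert get_target_image(encoded, (0, 0), decoder=len) == 2
```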
  • the display instruction is a request for triggering the client to perform image display.
  • the display instruction can be triggered by the user or automatically generated by the client when an application jump occurs.
  • The server generates two-dimensional images of the three-dimensional object model from multiple perspectives in advance, generates a two-dimensional image set that records the two-dimensional images at the multiple different perspectives, and then delivers the two-dimensional image set to the client. There is no need to rely on the client to directly download and process the 3D object model, so the two-dimensional target image corresponding to the target perspective can be obtained quickly.
  • After the client determines the two-dimensional target image corresponding to the target perspective, the two-dimensional target image can be directly displayed.
  • The client obtains the target perspective corresponding to each switching operation and displays, on the short-video playback interface, the two-dimensional target image at the target perspective corresponding to each switching operation.
  • Embodiments of the present disclosure provide an image display method: a two-dimensional image set sent by a server is received, where the set records two-dimensional images of a three-dimensional object model at multiple different viewing angles; in response to a display instruction for the three-dimensional object model at a target viewing angle, the two-dimensional target image corresponding to the target viewing angle is parsed from the two-dimensional image set; and the two-dimensional target image is displayed.
  • The client can directly determine and display the two-dimensional target image of the target perspective from the two-dimensional image set. Because the complexity and memory footprint of a two-dimensional target image are small, displaying it does not occupy excessive network resources or memory, which avoids lag during the image display process and ultimately improves the user's viewing experience.
  • the client can obtain the encoding data corresponding to the two-dimensional target image from the target perspective from the two-dimensional image collection, and parse the encoding data to determine the two-dimensional target image.
  • S120 may specifically include the following steps:
  • Since the two-dimensional image set includes encoded data corresponding to the two-dimensional images of the three-dimensional object model at multiple different perspectives, the client first obtains, based on the target perspective, the encoded data corresponding to the two-dimensional target image at the target perspective from the two-dimensional image set, and then parses that encoded data using a decoder such as H.265 to obtain the two-dimensional target image corresponding to the target perspective.
  • the encoded data may be a compressed encoding product in binary format corresponding to each perspective. Specifically, the encoded data corresponding to multiple two-dimensional images at different viewing angles is obtained in advance by the server by compressing and encoding the two-dimensional images at different viewing angles.
  • the client only needs to obtain the encoded data corresponding to the two-dimensional target image from the target perspective from the two-dimensional image set received by the server and parse the encoded data, and then it can obtain the two-dimensional target image corresponding to the target perspective and display it. Therefore, the image display process only uses the decoding capability of the client, has good compatibility, and has low requirements on the client.
  • The decoding process is simple, easy to implement, and can be batched, so the resulting product is smaller, allowing the client to download the encoded data faster.
  • the encoded data includes different attribute information, and the different attribute information is used to mark different information, so that the client displays the corresponding two-dimensional target image based on the different attribute information.
  • The encoded data includes first attribute information of the two-dimensional image, where the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the viewing angle information in the first attribute information is used to determine the target perspective.
  • the client when it parses the size information from the first attribute information, it can determine the display size of the two-dimensional image based on the size information.
  • the display size refers to the display size of the two-dimensional image.
  • the display size may include length, width, height, etc.
  • the viewing angle information can be used to mark the viewing angle corresponding to the two-dimensional image.
  • the client parses the perspective information from the first attribute information, it can determine the target perspective based on the perspective information.
  • The client accurately determines the two-dimensional target image by parsing the perspective information in the first attribute information included in the encoded data, and displays the two-dimensional target image at the corresponding size based on the size information in the first attribute information, ensuring the accuracy of both the parsing of the encoded data and the image display.
  • the encoded data includes second attribute information of the two-dimensional image, where the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data.
  • S1202 may specifically include the following steps:
  • The position information may be the byte offset and byte length of each channel's data within the encoded data. That is to say, the two-dimensional image corresponding to each viewing angle has a corresponding byte offset and byte length, and the encoded data of the two-dimensional image corresponding to each viewing angle includes encoded data of different channels. Based on the position information of the different channels of the two-dimensional target image in the encoded data, the encoded data of the different channels corresponding to the target perspective can be obtained from the encoded data.
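Addressing a channel by (byte offset, byte length) can be sketched in a few lines. The blob layout below is a toy assumption for illustration; the point is that no scanning of the downloaded data is needed, only a slice.

```python
def slice_channel(blob: bytes, offset: int, length: int) -> bytes:
    """Extract one channel's encoded bytes using its position information."""
    return blob[offset:offset + length]

# Toy blob: color (YUV) encoding at offset 0 (4 bytes),
# transparency (alpha) encoding at offset 4 (2 bytes).
blob = b"YUV!" + b"A!"
positions = {"color": (0, 4), "alpha": (4, 2)}

assert slice_channel(blob, *positions["color"]) == b"YUV!"
assert slice_channel(blob, *positions["alpha"]) == b"A!"
```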
  • the color channel may be a YUV channel, and the data corresponding to the color channel is data in YUV format.
  • Color channel encoding refers to the encoding data corresponding to the color channels of a two-dimensional image.
  • For the color channel, the server can first obtain multiple RGB-format images as the two-dimensional images corresponding to the multiple viewing angles, extract the image data of the YUV channels from the RGB-format images to obtain the image data of the two-dimensional images corresponding to the multiple viewing angles in the YUV channels, and then compress and encode that image data to obtain the encoded data of the YUV channels of the two-dimensional images corresponding to the multiple viewing angles, where the YUV channels of each two-dimensional image correspond to position information in the encoded data.
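The RGB-to-YUV extraction step can be illustrated per pixel. The patent does not name a specific color matrix; the BT.601 full-range coefficients below are an assumption used purely for illustration.

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple:
    """Convert one RGB pixel (0..255) to YUV using assumed BT.601 coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b          # luma
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128   # blue-difference chroma
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128    # red-difference chroma
    return y, u, v

# White maps to full luma with neutral chroma (about 128).
y, u, v = rgb_to_yuv(255, 255, 255)
assert round(y) == 255 and abs(u - 128) < 1 and abs(v - 128) < 1
```

A server would apply this to every pixel of every view's RGB image before compressing the resulting Y, U, and V planes.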
  • In this way, after the client obtains the encoded data corresponding to the multiple viewing angles, it can directly obtain the color channel encoding corresponding to the target perspective from the encoded data, based on the position information of the color channel of the two-dimensional target image in the encoded data.
  • the transparency channel can be an Alpha channel, and the data corresponding to the transparency channel is grayscale data.
  • Transparency channel encoding refers to the encoding data corresponding to the transparency channel of a two-dimensional image.
  • For the transparency channel, the server can first obtain multiple RGB-format images as the two-dimensional images corresponding to the multiple viewing angles, extract the image data of the Alpha channel from them to obtain the image data of the two-dimensional images corresponding to the multiple viewing angles in the Alpha channel, and then compress and encode that image data to obtain the encoded data of the Alpha channel of the two-dimensional images corresponding to the multiple viewing angles, where the Alpha channel of each two-dimensional image corresponds to position information in the encoded data. In this way, after the client obtains the encoded data corresponding to the multiple viewing angles, it can directly obtain the transparency channel encoding corresponding to the target perspective from the encoded data, based on the position information of the transparency channel of the two-dimensional target image in the encoded data.
  • After the client obtains the color channel encoding and the transparency channel encoding, it can fuse them based on pixel position to generate and display the two-dimensional target image. As a result, the client can display a two-dimensional target image that combines color data and transparency data, meeting the user's needs for viewing both color and transparency.
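The position-based fusion step can be sketched as a per-pixel merge of the two decoded planes. Plain Python lists stand in for decoded frame buffers here; a real client would operate on the decoder's output buffers.

```python
def fuse(rgb_pixels, alpha_pixels):
    """Merge decoded color and transparency planes into RGBA tuples, per pixel.

    `rgb_pixels` is a list of (R, G, B) tuples from the color channel;
    `alpha_pixels` is a list of grayscale values from the transparency channel,
    aligned by pixel position.
    """
    assert len(rgb_pixels) == len(alpha_pixels), "planes must align pixel-for-pixel"
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_pixels, alpha_pixels)]

rgb = [(255, 0, 0), (0, 255, 0)]
alpha = [255, 128]  # grayscale data decoded from the Alpha channel
assert fuse(rgb, alpha) == [(255, 0, 0, 255), (0, 255, 0, 128)]
```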
  • the client can also generate and display a two-dimensional target image based on color channel encoding only, or the client can also generate and display a two-dimensional target image based on transparency channel encoding only.
  • the client can display a single channel 2D target image to the user.
  • The encoded data includes different data frame types, and the encoded data corresponding to different frame types is obtained in different ways. Therefore, to ensure that encoded data of all data frame types can be found, in some embodiments the encoded data also includes third attribute information of the two-dimensional image, where the third attribute information is used to mark the data frame types of different channels of the two-dimensional image in the encoded data.
  • S12022 may specifically include the following steps:
  • the data frame type of the color channel corresponding to the target perspective is a non-key frame type
  • the data frame type of the color channel corresponding to the target perspective is a key frame type
  • S12023 may specifically include the following steps:
  • the data frame type of the transparency channel corresponding to the target perspective is a non-key frame type
  • the data frame type of the transparency channel corresponding to the target perspective is a key frame type
  • the key frame encoding of the transparency channel corresponding to the target perspective is directly used as the transparency channel encoding corresponding to the target perspective.
  • the data frame type is the frame type of the two-dimensional image
  • the two-dimensional images corresponding to different viewing angles correspond to unique frame types.
  • the two-dimensional images may include key frame type images and non-key frame type images
  • the encoding data may include key frame encoding and non-key frame encoding.
  • the obtained two-dimensional images from multiple perspectives can be grouped, and the two-dimensional images in each group can be compressed and encoded.
  • the two-dimensional images corresponding to nine consecutive, evenly spaced perspectives can be divided into a group, in which the two-dimensional image located at the center of each group is marked as a key frame type two-dimensional image.
  • the eight two-dimensional images located at non-center positions in the group of two-dimensional images are marked as non-keyframe type two-dimensional images.
  • After the server compresses and encodes each group of two-dimensional images, the encoded data corresponding to the two-dimensional image at the center of each group is determined to be a key frame encoding, and the encoded data corresponding to the eight two-dimensional images at non-center positions of each group is determined to be non-key frame encodings.
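The grouping rule above (3x3 blocks of views, center view marked as the key frame) can be sketched as follows. The group shape and the "center = key frame" rule come from the text; the function and view-tuple representation are illustrative assumptions.

```python
def mark_group(group_3x3):
    """Label a 3x3 block of views: the center view -> 'I', the other eight -> 'P'."""
    labels = {}
    for row in range(3):
        for col in range(3):
            labels[group_3x3[row][col]] = "I" if (row, col) == (1, 1) else "P"
    return labels

# Views sampled every 6 degrees around (0, 0), as (longitude, latitude) tuples.
group = [[(lon, lat) for lon in (-6, 0, 6)] for lat in (-6, 0, 6)]
labels = mark_group(group)

assert labels[(0, 0)] == "I"                      # center view is the key frame
assert sum(v == "P" for v in labels.values()) == 8  # eight non-key frames
```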
  • The compressed encodings corresponding to key frame type two-dimensional images are all used as key frame encodings, while decoding a non-key frame type two-dimensional image involves both its own non-key frame encoding and the key frame encoding of its group. Therefore, for the color channel, when the client obtains the encoded data of the different channels corresponding to the target perspective: if the data frame type of the color channel corresponding to the target perspective is a non-key frame type, the client needs to obtain both the key frame encoding and the non-key frame encoding of the color channel corresponding to the target perspective and generate the color channel encoding corresponding to the target perspective from them; if the data frame type of the color channel corresponding to the target perspective is a key frame type, the key frame encoding of the color channel corresponding to the target perspective is directly used as the color channel encoding corresponding to the target perspective.
  • the key frame coding corresponding to the target perspective refers to the coding data of the key frame in the group where the target perspective is located.
  • the non-key frame encoding corresponding to the target perspective refers to the encoding data actually corresponding to the perspective. It should be noted that the principle of determining the transparency channel encoding is the same as that of determining the color channel encoding, and will not be described again here.
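The frame-type branch can be sketched as below. The record fields mirror the protocol-header layout described later in the document (Offset/Length for the frame itself, I_Offset/I_Length for the reference key frame), but the function itself is a hedged illustration, not the patent's implementation.

```python
def collect_color_encoding(record: dict, blob: bytes) -> list:
    """Return the byte chunks needed to decode one view's color channel.

    A key ('I') frame is self-contained; a non-key ('P') frame additionally
    needs its group's reference I-frame bytes, fetched first.
    """
    chunks = []
    if record["Frame Type"] == "P":
        # Non-key frame: prepend the reference key frame encoding.
        chunks.append(blob[record["I_Offset"]:record["I_Offset"] + record["I_Length"]])
    chunks.append(blob[record["Offset"]:record["Offset"] + record["Length"]])
    return chunks

blob = b"IFRAMEpframe"
p_rec = {"Frame Type": "P", "Offset": 6, "Length": 6, "I_Offset": 0, "I_Length": 6}
i_rec = {"Frame Type": "I", "Offset": 0, "Length": 6}

assert collect_color_encoding(p_rec, blob) == [b"IFRAME", b"pframe"]
assert collect_color_encoding(i_rec, blob) == [b"IFRAME"]
```

The transparency channel would follow the same logic with the Alpha_* fields.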
  • The color channel encoding and transparency channel encoding corresponding to each group of two-dimensional images can be expressed as follows: the color channel encoding of each group includes one key frame (I) encoding and eight non-key frame (P) encodings, and the transparency channel encoding of each group likewise includes one key frame (I) encoding and eight non-key frame (P) encodings.
  • the encoded data of different data frame types can be obtained based on different logic.
  • the comprehensiveness and accuracy of the encoded data acquisition of the two channels are ensured.
  • The encoded data includes a protocol header, where the protocol header includes one or a combination of the first attribute information, the second attribute information, and the third attribute information.
  • the client can store the encoding protocol header locally, and then obtain and quickly parse the encoded data locally.
  • the method further includes the following steps:
  • S1202 may specifically include the following steps:
  • Obtain the target protocol header corresponding to the target perspective from the preset storage structure, parse the encoded data corresponding to the target protocol header, and obtain the two-dimensional target image corresponding to the target perspective.
  • The preset storage structure can be a memory storage structure corresponding to the client, specifically a Map structure.
  • A Map is a collection that maps key objects to value objects; each of its elements contains a key object and a value object.
  • The protocol header in the preset storage structure can be encoded as follows:
    {
      "Longitude,Latitude": [
        {
          "Frame Type": "I/P",
          "Offset": "Current Offset",
          "Length": "Current Frame Length",
          "Alpha_Offset": "Current Frame Alpha Offset",
          "Alpha_Length": "Current Frame Alpha Length",
          "I_Offset": "Reference I Frame Offset",
          "I_Length": "Reference I Frame Length",
          "Alpha_I_Offset": "Reference I Frame Alpha Offset",
          "Alpha_I_Length": "Reference I Frame Alpha Length"
        }
      ]
    }
  • the decoding stage by parsing the protocol header from the preset storage structure corresponding to the client, the encoded data corresponding to the target perspective can be quickly parsed, thereby improving the parsing efficiency of the two-dimensional target image corresponding to the target perspective and further optimizing image display efficiency.
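Storing parsed protocol headers in a Map keyed by "Longitude,Latitude" makes retrieving the record for a target perspective a single lookup. The field names follow the header layout in the document; the parsing code and sample values are illustrative assumptions.

```python
import json

# A tiny sample header; real headers would hold one record set per perspective.
header_json = """
{
  "0,0": [{"Frame Type": "I", "Offset": 0, "Length": 6,
           "Alpha_Offset": 6, "Alpha_Length": 2}]
}
"""

header_map = json.loads(header_json)  # the client-side Map-style structure

def lookup(longitude: int, latitude: int) -> dict:
    """Fetch the protocol-header record for a target perspective."""
    return header_map[f"{longitude},{latitude}"][0]

rec = lookup(0, 0)
assert rec["Frame Type"] == "I" and rec["Length"] == 6
```

Because the Map is built once when the two-dimensional image set arrives, every later perspective switch pays only the cost of this lookup plus decoding.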
  • FIG. 2 shows a logical schematic diagram of an image display method provided by an embodiment of the present disclosure.
  • the image display method includes the following processes:
  • S210 Receive a two-dimensional image set sent by the server.
  • the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model from multiple different perspectives.
  • the client can store the protocol header corresponding to each perspective in a preset storage structure, such as a map, based on the protocol header format corresponding to each perspective.
  • the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the target angle of view is determined based on the angle of view information in the first attribute information.
  • the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data.
  • the third attribute information is used to mark the data frame types of different channels of the two-dimensional image in the encoded data.
  • an image processing method that reduces network transmission resources and memory usage.
  • the image processing method can be applied to the server.
  • the server can be a cloud server or a server cluster.
  • FIG. 3 shows a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. As shown in Figure 3, the image processing method includes the following steps.
  • the two-dimensional images are divided into groups; the two-dimensional image located at the center of each group is set as a key frame type image, and the two-dimensional images located at non-center positions in each group are set as non-key frame type images; for the north and south poles (*, -90) and (*, 90), the two-dimensional images are combined into groups of 6 in a 3*2 layout, where the image at the pole is a key frame type image and the surrounding 5 are non-key frame type images.
  • Figure 4 shows a schematic diagram of a two-dimensional image acquisition.
  • a two-dimensional image can be collected every 6 degrees.
  • a two-dimensional image can be collected at viewing angles such as (-6, 6), (0, 0), and (-6, 0), so 9 two-dimensional images can be obtained and taken as one group.
  • the 2D image at the center of each group is set as a key frame type image
  • the 2D images at non-center positions are set as non-key frame type images.
  • the key frame type image in each group can serve as the reference for the non-key frame type images, so each group includes 1 key frame type image and 8 non-key frame type images.
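The 3*3 grouping rule above can be sketched in a few lines. This is a hypothetical Python illustration: the 6-degree step and the center-is-key-frame rule come from the text, while the function name and the (longitude, latitude) tuple representation are assumptions.

```python
def group_frame_types(center_lon, center_lat, step=6):
    """Return {(lon, lat): "I" or "P"} for a 3*3 group of viewing angles,
    where the center angle is the key frame (I) and the rest are P frames."""
    types = {}
    for dlon in (-step, 0, step):
        for dlat in (-step, 0, step):
            angle = (center_lon + dlon, center_lat + dlat)
            types[angle] = "I" if (dlon, dlat) == (0, 0) else "P"
    return types

group = group_frame_types(0, 0)  # group centered at viewing angle (0, 0)
```

Each group thus yields exactly 1 key frame and 8 non-key frames, matching the 1 I + 8 P layout described above.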
  • S320 Send the two-dimensional image collection to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
  • the server can directly send a two-dimensional image set including the two-dimensional images corresponding to multiple viewing angles to the client, so that the client looks up the two-dimensional target image corresponding to the target viewing angle from the set and displays it.
  • the server may compress and encode the two-dimensional images corresponding to multiple viewing angles to obtain encoded data, forming a two-dimensional image set that records the two-dimensional images from multiple different viewing angles, and send the set to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
  • Embodiments of the present disclosure provide an image processing method.
  • the server obtains a two-dimensional image set generated from a three-dimensional object model at multiple viewing angles, where the two-dimensional image set records the two-dimensional images of the three-dimensional object model at multiple different viewing angles.
  • the server sends the two-dimensional image set to the client so that the client can parse and display the two-dimensional target image corresponding to the target perspective.
  • the processing of the three-dimensional object model is executed in the server, so that the client can directly obtain and display the two-dimensional target image corresponding to the target perspective. Therefore, the process of displaying the image on the client will not occupy too many network resources and memory, thereby avoiding lagging during the image display process, and ultimately improving the user's interactive experience in viewing the image display process.
  • the server compresses and codes the two-dimensional images corresponding to multiple viewing angles, and then sends the encoded data corresponding to the multiple viewing angles to the client.
  • S310 may specifically include the following steps:
  • the server can use an encoder such as an H.265 encoder to compress and encode the two-dimensional images from multiple different perspectives and generate the corresponding encoded data, that is, generate a binary file, and use the encoded data from the multiple different perspectives as the two-dimensional image set.
  • the encoded data may include parameter coding and a coding body.
  • parameter coding refers to the parameters produced by encoding the two-dimensional image
  • the coding body refers to the body of the encoded data.
  • parameter coding can include parameters such as the video parameter set (VPS_NUT), the sequence parameter set (SPS_NUT), and the picture parameter set (PPS_NUT).
  • the encoded data corresponding to the two-dimensional image in each group can be expressed in the following way:
  • the server can compress and encode multiple two-dimensional images from different perspectives, generate a small binary file, and provide the binary file to the client, so that the client can download the encoded data within a short time.
  • the encoded data includes a variety of attribute information.
  • the encoded data includes first attribute information of the two-dimensional image, wherein the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the perspective information in the first attribute information is used to mark the perspective of the two-dimensional image.
  • the encoded data also includes second attribute information of the two-dimensional image, where the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data, so that the client can obtain the data of different channels of the two-dimensional image corresponding to the target perspective from the encoded data according to the position information.
  • the encoded data also includes third attribute information of the two-dimensional image, where the third attribute information is used to mark the data frame types of different channels of the two-dimensional image in the encoded data, so that the client can decode the data of different channels of the two-dimensional image corresponding to the target perspective according to the data frame type.
  • the server can also add a protocol header to the encoded data of the two-dimensional image corresponding to each viewing angle.
  • the server can add the protocol header to the encoded data of the two-dimensional image corresponding to each viewing angle based on the preset protocol header protocol, and then obtain the encoded data carrying the protocol header.
  • the preset header protocol may be an autoregressive (AR) protocol.
  • the protocol header may include one or more combinations of the above first attribute information, second attribute information and third attribute information.
  • protocol header can be in the following format:
  • the protocol header can mark various information of its corresponding encoded data.
  • FIG. 5 shows a logical schematic diagram of an image processing and display method.
  • the image processing and display process includes the following steps:
  • S540 Compress and encode the image data of the color channels of the two-dimensional images corresponding to multiple viewing angles to obtain color channel coding corresponding to the multiple viewing angles.
  • S550 Compress and encode the image data in the transparency channel of the two-dimensional images corresponding to multiple viewing angles to obtain transparency channel coding corresponding to the multiple viewing angles.
  • S560 Combine the color channel coding corresponding to multiple viewing angles and the transparency channel coding corresponding to multiple viewing angles to obtain coding data of two-dimensional images corresponding to multiple viewing angles.
  • S570 Add protocol headers to the encoded data of the two-dimensional images corresponding to multiple viewing angles to generate a two-dimensional image set generated by the three-dimensional object model at multiple viewing angles.
  • the two-dimensional image set is used to record the two-dimensional images of the three-dimensional object model at multiple different viewing angles.
  • S510 ⁇ S570 are all executed by the server.
  • S580~S593 are all executed by the client.
  • the present disclosure also provides an image display device, which is configured on the client.
  • FIG. 6 a schematic structural diagram of an image display device 600 is provided according to an embodiment of the present disclosure.
  • the image display device 600 includes:
  • the receiving module 601 is used to receive a two-dimensional image set sent by the server.
  • the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different viewing angles;
  • the parsing module 602 is configured to parse the two-dimensional target image corresponding to the target perspective from the two-dimensional image set in response to the display instruction of the three-dimensional object model in the target perspective;
  • Image display module 603 is used to display the two-dimensional target image.
  • Embodiments of the present disclosure provide an image display device that receives a two-dimensional image set sent by a server.
  • the two-dimensional image set is used to record two-dimensional images of a three-dimensional object model at multiple different viewing angles; in response to a display instruction for the three-dimensional object model at a target viewing angle, the device parses the two-dimensional target image corresponding to the target viewing angle from the two-dimensional image set and displays it.
  • through this process, the client can directly determine the two-dimensional target image of the target perspective from the two-dimensional image set for display; since the complexity and memory footprint of the two-dimensional target image are small, displaying the image on the client does not occupy excessive network resources and memory, which avoids lag during image display and ultimately improves the user's viewing experience.
  • the parsing module 602 includes:
  • An acquisition unit configured to acquire the encoded data corresponding to the two-dimensional target image under the target perspective from the two-dimensional image set;
  • An analysis unit is used to analyze the encoded data and obtain a two-dimensional target image corresponding to the target perspective.
  • the encoded data includes first attribute information of the two-dimensional image, wherein the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the target viewing angle is determined according to the viewing angle information in the first attribute information.
  • the encoded data includes second attribute information of the two-dimensional image, wherein the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data;
  • the parsing unit is specifically configured to obtain, from the second attribute information, the position information of the color channel of the two-dimensional target image in the encoded data and the position information of the transparency channel of the two-dimensional target image in the encoded data;
  • the color channel coding and the transparency channel coding are fused to generate a two-dimensional target image corresponding to the target perspective.
  • the encoded data further includes third attribute information of the two-dimensional image, wherein the third attribute information is used to mark the data frame types of different channels of the two-dimensional image in the encoded data;
  • the parsing unit is further configured to obtain, from the encoded data, the data frame type of the color channel corresponding to the target perspective according to the third attribute information;
  • if the data frame type of the color channel corresponding to the target perspective is a non-key frame type, the key frame coding and non-key frame coding of the color channel are obtained from the encoded data according to the position information of the color channel, and the color channel coding corresponding to the target perspective is generated from them;
  • if the data frame type of the color channel corresponding to the target perspective is a key frame type, the key frame coding of the color channel is obtained from the encoded data according to the position information, and used as the color channel coding corresponding to the target perspective.
  • the parsing unit is further configured to obtain, from the encoded data, the data frame type of the transparency channel corresponding to the target perspective according to the third attribute information;
  • if the data frame type of the transparency channel corresponding to the target perspective is a non-key frame type, the key frame coding and non-key frame coding of the transparency channel are obtained from the encoded data according to the position information of the transparency channel, and the transparency channel coding corresponding to the target perspective is generated from them;
  • if the data frame type of the transparency channel corresponding to the target perspective is a key frame type, the key frame coding of the transparency channel is obtained from the encoded data according to the position information, and used as the transparency channel coding corresponding to the target perspective.
  • the encoded data includes a protocol header, wherein the protocol header includes one or more combinations of first attribute information, second attribute information, and third attribute information.
  • the present disclosure also provides an image processing device, which is configured on a server.
  • FIG. 7 a schematic structural diagram of an image processing device 700 is provided according to an embodiment of the present disclosure.
  • the image processing device 700 includes:
  • the acquisition module 701 is used to acquire a set of two-dimensional images generated by a three-dimensional object model at multiple viewing angles, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different viewing angles;
  • the sending module 702 is used to send the two-dimensional image set to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
  • Embodiments of the present disclosure provide an image processing device.
  • the server obtains a two-dimensional image set generated from a three-dimensional object model at multiple viewing angles, where the two-dimensional image set records two-dimensional images of the three-dimensional object model at multiple different viewing angles; the two-dimensional image set is sent to the client so that the client can parse and display the two-dimensional target image corresponding to the target perspective.
  • the processing of the three-dimensional object model is executed in the server, so that the client can directly obtain and display the two-dimensional target image corresponding to the target perspective. Therefore, the process of displaying the image on the client will not occupy too many network resources and memory, thereby avoiding lagging during the image display process, and ultimately improving the user's interactive experience in viewing the image display process.
  • the sending module 702 includes:
  • a compression encoding unit configured to compress and encode the two-dimensional images at the plurality of different viewing angles, generate the encoded data for the plurality of different viewing angles, and use the encoded data for the plurality of different viewing angles as the two-dimensional image set.
  • the encoded data includes first attribute information of the two-dimensional image, wherein the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the perspective information in the first attribute information is used to mark the perspective of the two-dimensional image.
  • the encoded data further includes second attribute information of the two-dimensional image, wherein the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data, so that the client can obtain the data of different channels of the two-dimensional image corresponding to the target perspective from the encoded data according to the position information.
  • the encoded data further includes third attribute information of the two-dimensional image, wherein the third attribute information is used to mark the data frame types of different channels of the two-dimensional image in the encoded data, so that the client can decode the data of different channels of the two-dimensional image corresponding to the target perspective according to the data frame types.
  • embodiments of the present disclosure also provide a computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to implement the image display method or the image processing method of the embodiments of the present disclosure.
  • An embodiment of the present disclosure also provides a computer program product.
  • the computer program product includes a computer program/instructions.
  • when the computer program/instructions are executed by a processor, the image display method or the image processing method of the embodiments of the present disclosure is implemented.
  • Figure 8 shows a schematic structural diagram of a client or server provided by an embodiment of the present disclosure.
  • the client or server may include a controller 801 and a memory 802 storing computer program instructions.
  • the controller 801 may include a central processing unit (CPU), or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
  • Memory 802 may include bulk storage for information or instructions.
  • the memory 802 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these.
  • Memory 802 may include removable or non-removable (or fixed) media, where appropriate.
  • Memory 802 may be internal or external to the integrated gateway device, where appropriate.
  • memory 802 is non-volatile solid-state memory.
  • memory 802 includes read-only memory (ROM).
  • the ROM can be a mask-programmed ROM, a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), an electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
  • the controller 801 reads and executes the computer program instructions stored in the memory 802 to perform the steps of the image display method provided by the embodiment of the present disclosure, or perform the steps of the image processing method provided by the embodiment of the present disclosure.
  • the client or server may also include a transceiver 803 and a bus 804.
  • the controller 801, the memory 802 and the transceiver 803 are connected through the bus 804 and complete communication with each other.
  • Bus 804 includes hardware, software, or both.
  • the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), another suitable bus, or a combination of two or more of these.
  • bus 804 may include one or more buses.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Graphics (AREA)
  • Databases & Information Systems (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An image display method, an image processing method, an apparatus, a device, and a medium. The image display method includes: receiving a two-dimensional image set sent by a server, where the two-dimensional image set records two-dimensional images of a three-dimensional object model at multiple different viewing angles; in response to a display instruction for the three-dimensional object model at a target viewing angle, parsing, from the two-dimensional image set, the two-dimensional target image corresponding to the target viewing angle; and displaying the two-dimensional target image. Through this process, the client can directly determine the two-dimensional target image of the target viewing angle from the two-dimensional image set for display; since the complexity and memory footprint of the two-dimensional target image are small, displaying the image on the client does not occupy excessive network resources or memory, which avoids lag during image display and ultimately improves the user's viewing experience.

Description

Image display method, image processing method, apparatus, device, and medium

This application claims priority to Chinese Patent Application No. 202210999321.1 filed on August 19, 2022, the disclosure of which is incorporated herein by reference in its entirety as part of this application.

Technical Field

Embodiments of the present disclosure relate to an image display method, an image processing method, an apparatus, a device, and a medium.

Background

With the continuous development of computer technology, more and more clients can display three-dimensional object models. To improve the display effect of a three-dimensional object model and enhance the user's interactive experience, a client needs to display two-dimensional images of the three-dimensional object model at multiple viewing angles.

To display the two-dimensional images corresponding to multiple viewing angles of a three-dimensional object model on a client, the three-dimensional object model is generally downloaded and processed directly, and the two-dimensional images of the model at each viewing angle are then displayed. However, the spatial information of a three-dimensional object model is complex and occupies a large amount of memory; processing the model directly takes a long time, which degrades the real-time performance of the display process and ultimately lowers the user's viewing experience.

Summary

To solve the above technical problems, or at least partially solve them, the present disclosure provides an image display method, an image processing method, an apparatus, a device, and a medium.

In a first aspect, the present disclosure provides an image display method applied to a client, the method including:

receiving a two-dimensional image set sent by a server, where the two-dimensional image set records two-dimensional images of a three-dimensional object model at multiple different viewing angles;

in response to a display instruction for the three-dimensional object model at a target viewing angle, parsing, from the two-dimensional image set, the two-dimensional target image corresponding to the target viewing angle; and

displaying the two-dimensional target image.

In a second aspect, the present disclosure provides an image processing method applied to a server, the method including:

obtaining a two-dimensional image set generated from a three-dimensional object model at multiple viewing angles, where the two-dimensional image set records two-dimensional images of the three-dimensional object model at multiple different viewing angles; and

sending the two-dimensional image set to a client, so that the client parses and displays the two-dimensional target image corresponding to a target viewing angle.

In a third aspect, the present disclosure provides an image display apparatus configured on a client, the apparatus including:

a receiving module configured to receive a two-dimensional image set sent by a server, where the two-dimensional image set records two-dimensional images of a three-dimensional object model at multiple different viewing angles;

a parsing module configured to, in response to a display instruction for the three-dimensional object model at a target viewing angle, parse, from the two-dimensional image set, the two-dimensional target image corresponding to the target viewing angle; and

an image display module configured to display the two-dimensional target image.

In a fourth aspect, the present disclosure provides an image processing apparatus configured on a server, the apparatus including:

an obtaining module configured to obtain a two-dimensional image set generated from a three-dimensional object model at multiple viewing angles, where the two-dimensional image set records two-dimensional images of the three-dimensional object model at multiple different viewing angles; and

a sending module configured to send the two-dimensional image set to a client, so that the client parses and displays the two-dimensional target image corresponding to a target viewing angle.

In a fifth aspect, the present disclosure provides a computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to implement the above methods.

In a sixth aspect, the present disclosure provides a device including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor implements the above methods when executing the computer program.

In a seventh aspect, the present disclosure provides a computer program product including a computer program/instructions that, when executed by a processor, implement the above methods.
Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.

To describe the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, a person of ordinary skill in the art can derive other drawings from these drawings without creative effort.

FIG. 1 is a schematic flowchart of an image display method provided by an embodiment of the present disclosure;

FIG. 2 is a logical schematic diagram of an image display method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of two-dimensional image acquisition provided by an embodiment of the present disclosure;

FIG. 5 is a logical schematic diagram of an image processing and display method provided by an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of an image display apparatus provided by an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure; and

FIG. 8 is a schematic structural diagram of a client or server provided by an embodiment of the present disclosure.
Detailed Description

To make the above objectives, features, and advantages of the present disclosure clearer, the solutions of the present disclosure are further described below. It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.

Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure may also be implemented in ways other than those described here; obviously, the embodiments in this specification are only some, not all, of the embodiments of the present disclosure.

At present, many three-dimensional object models have very high resolution and texture complexity. When displaying images, downloading and processing a three-dimensional object model places high demands on network speed and memory, making it difficult to meet the requirement of real-time image display.

To solve the above problem, embodiments of the present disclosure provide an image display method, apparatus, device, and medium. The image display method can be applied to a client, where the client may include, but is not limited to, a mobile phone, a tablet, a laptop, a desktop computer, a smart home device, a wearable device, an in-vehicle device, and the like.

FIG. 1 is a schematic flowchart of an image display method provided by an embodiment of the present disclosure. As shown in FIG. 1, the image display method includes the following steps.

S110: Receive a two-dimensional image set sent by a server, where the two-dimensional image set records two-dimensional images of a three-dimensional object model at multiple different viewing angles.

In practice, when an image needs to be displayed, the client downloads and stores the two-dimensional image set from the server. Since the set records two-dimensional images of the three-dimensional object model at multiple different viewing angles, it can provide a two-dimensional image at any viewing angle.

In an embodiment of the present disclosure, the three-dimensional object model may be a three-dimensional model of the object to be displayed.

Optionally, the three-dimensional object model may have a color feature, a transparency feature, or both.

In an embodiment of the present disclosure, a viewing angle can be understood as a screen viewing angle. Specifically, a viewing angle may take the form of longitude and latitude, where the latitude range may be [-90, 90] and the longitude range [0, 360].

In an embodiment of the present disclosure, a two-dimensional image refers to an image of the three-dimensional object model at any viewing angle.

S120: In response to a display instruction for the three-dimensional object model at a target viewing angle, parse, from the two-dimensional image set, the two-dimensional target image corresponding to the target viewing angle.

In practice, the client obtains an externally input display instruction carrying the target viewing angle, and then determines the corresponding two-dimensional target image based on the two-dimensional image set. Specifically, if the set includes two-dimensional images corresponding to multiple different viewing angles, the two-dimensional target image corresponding to the target viewing angle can be looked up directly from the set; if the set includes the encoded data of two-dimensional images corresponding to multiple different viewing angles, the encoded data corresponding to the target viewing angle is decoded to obtain the two-dimensional target image.

In an embodiment of the present disclosure, a display instruction is a request that triggers the client to display an image. Optionally, the display instruction may be triggered by a user, or generated automatically by the client when an application jump occurs.

It should be noted that the geometric features of a three-dimensional object model are highly detailed and subject to various reflections and refractions, and the model's materials are rich and extremely complex. By generating two-dimensional images of the model at multiple viewing angles on the server in advance, generating a two-dimensional image set recording those images, and then downloading the set to the client, there is no need for the client to download and process the three-dimensional object model directly, so the two-dimensional target image corresponding to the target viewing angle can be obtained quickly.

S130: Display the two-dimensional target image.

In practice, after determining the two-dimensional target image corresponding to the target viewing angle, the client can display it directly.

For example, if a user wants to keep switching the two-dimensional images of a three-dimensional object model on the playback interface of a short video, the client obtains the target viewing angle corresponding to each switch operation and displays, on the playback interface, the two-dimensional target image of that target viewing angle.

An embodiment of the present disclosure provides an image display method: receive a two-dimensional image set sent by a server, where the set records two-dimensional images of a three-dimensional object model at multiple different viewing angles; in response to a display instruction for the model at a target viewing angle, parse the corresponding two-dimensional target image from the set; and display it. Through this process, the client can directly determine the two-dimensional target image of the target viewing angle from the set for display; since the complexity and memory footprint of the two-dimensional target image are small, displaying the image does not occupy excessive network resources or memory, which avoids lag during display and ultimately improves the user's viewing experience.
In another embodiment of the present disclosure, the client can obtain, from the two-dimensional image set, the encoded data corresponding to the two-dimensional target image at the target viewing angle, and parse the encoded data to determine the two-dimensional target image.

In an embodiment of the present disclosure, optionally, S120 may specifically include the following steps:

S1201: Obtain, from the two-dimensional image set, the encoded data corresponding to the two-dimensional target image at the target viewing angle;

S1202: Parse the encoded data to obtain the two-dimensional target image corresponding to the target viewing angle.

Specifically, since the two-dimensional image set includes the encoded data corresponding to the two-dimensional images of the three-dimensional object model at multiple different viewing angles, the client first obtains, from the set, the encoded data corresponding to the target viewing angle, and then parses the encoded data using, for example, an H.265 decoder to obtain the two-dimensional target image.

The encoded data may be the compression-encoding product, in binary format, corresponding to each viewing angle. Specifically, the encoded data corresponding to the two-dimensional images at multiple different viewing angles is obtained in advance by the server compressing and encoding those images.

Thus, the client only needs to obtain, from the two-dimensional image set received from the server, the encoded data corresponding to the target viewing angle and parse it to obtain and display the two-dimensional target image. The image display process therefore only uses the client's decoding capability, offers good compatibility, places low demands on the client, and the decoding process is simple, easy to implement, and suitable for mass production; the resulting product is small, so the client downloads the encoded data quickly.

In yet another embodiment of the present disclosure, the encoded data includes different attribute information used to mark different kinds of information, so that the client displays the corresponding two-dimensional target image based on the attribute information.

To guarantee the parsing accuracy of the encoded data and the accuracy of image display: in some embodiments, the encoded data includes first attribute information of the two-dimensional image, where the size information in the first attribute information is used to mark the display size of the two-dimensional image, and the target viewing angle is determined according to the viewing angle information in the first attribute information.

Specifically, when the client parses the size information from the first attribute information, it can determine the display size of the two-dimensional image accordingly. The display size refers to the displayed dimensions of the two-dimensional image; optionally, it may include length, width, height, and so on.

The viewing angle information can be used to mark the viewing angle corresponding to a two-dimensional image. Specifically, when the client parses the viewing angle information from the first attribute information, it can determine the target viewing angle accordingly.

Thus, in an embodiment of the present disclosure, the client accurately determines the two-dimensional target image by parsing the viewing angle information in the first attribute information of the encoded data, and displays the two-dimensional target image at the corresponding size based on the size information, guaranteeing the parsing accuracy of the encoded data and the accuracy of image display.

To improve the visualization effect of image display, color data and transparency data can be fused to generate the two-dimensional target image, satisfying the user's need to view both color and transparency. In some embodiments, the encoded data includes second attribute information of the two-dimensional image, where the second attribute information is used to mark the position information of different channels of the two-dimensional image in the encoded data.

Accordingly, S1202 may specifically include the following steps:

S12021: Obtain, from the second attribute information, the position information of the color channel of the two-dimensional target image in the encoded data and the position information of its transparency channel in the encoded data;

S12022: Based on the position information of the color channel of the two-dimensional target image in the encoded data, obtain the color channel coding corresponding to the target viewing angle from the encoded data;

S12023: Based on the position information of the transparency channel of the two-dimensional target image in the encoded data, obtain the transparency channel coding corresponding to the target viewing angle from the encoded data;

S12024: Fuse the color channel coding and the transparency channel coding to generate the two-dimensional target image corresponding to the target viewing angle.

The position information may be the byte offset and byte length of the different channels of a two-dimensional image within the encoded data. That is, the two-dimensional image corresponding to each viewing angle has a corresponding byte offset and byte length, and its encoded data includes encoded data for different channels. Thus, based on the position information of the different channels of the two-dimensional target image in the encoded data, the coding of the different channels corresponding to the target viewing angle can be obtained from the encoded data.

The color channel may be a YUV channel, whose corresponding data is in YUV format. Color channel coding refers to the encoded data corresponding to the color channel of a two-dimensional image.

Specifically, when compressing and encoding the two-dimensional images corresponding to multiple viewing angles, the server may first obtain multiple RGB-format images as the two-dimensional images, extract image data from the YUV channels of the RGB images to obtain the YUV-channel image data of the images at multiple viewing angles, and then compress and encode that data to obtain the YUV-channel encoded data of the images, where the YUV channel of each two-dimensional image has corresponding position information in the encoded data. In this way, after the client obtains the encoded data corresponding to multiple viewing angles, it can directly obtain the color channel coding corresponding to the target viewing angle from the encoded data according to the position information of the color channel of the two-dimensional target image.

The transparency channel may be an Alpha channel, whose corresponding data is grayscale data. Transparency channel coding refers to the encoded data corresponding to the transparency channel of a two-dimensional image.

Specifically, when compressing and encoding the two-dimensional images corresponding to multiple viewing angles, the server may first obtain multiple RGB-format images as the two-dimensional images, extract image data from the Alpha channels of the RGB images to obtain the Alpha-channel image data of the images at multiple viewing angles, and then compress and encode that data to obtain the Alpha-channel encoded data of the images, where the Alpha channel of each two-dimensional image has corresponding position information in the encoded data. In this way, after the client obtains the encoded data corresponding to multiple viewing angles, it can directly obtain the transparency channel coding corresponding to the target viewing angle from the encoded data according to the position information of the transparency channel of the two-dimensional target image.

Further, after obtaining the color channel coding and the transparency channel coding, the client can fuse them based on pixel positions to generate and display the two-dimensional target image. Thus, the client can display a two-dimensional target image fusing color data and transparency data, satisfying the user's need to view both color and transparency.
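The per-pixel fusion step can be pictured with the minimal sketch below. This is a hypothetical plain-Python illustration only: in practice the client would fuse the decoder's output buffers, and the RGB values would come from YUV-to-RGB conversion; the function name and list-of-tuples representation are assumptions.

```python
def fuse_color_and_alpha(rgb_pixels, alpha_pixels):
    """Fuse decoded color data and transparency data pixel by pixel into
    RGBA tuples, matching the two buffers by pixel position."""
    if len(rgb_pixels) != len(alpha_pixels):
        raise ValueError("channel buffers must cover the same pixels")
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_pixels, alpha_pixels)]

# Two pixels: opaque-ish red, fully opaque green.
rgba = fuse_color_and_alpha([(255, 0, 0), (0, 255, 0)], [128, 255])
```

The single-channel display variants mentioned below correspond to skipping this step and rendering one buffer alone.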
In other embodiments, the client may also generate and display the two-dimensional target image based only on the color channel coding, or based only on the transparency channel coding. Thus, the client can display a single-channel two-dimensional target image to the user.

It should be noted that two-dimensional images at different viewing angles correspond to different image frame types, so the encoded data includes different data frame types, and encoded data of different frame types is obtained in different ways. Therefore, to ensure that encoded data of all data frame types can be found, in some embodiments the encoded data further includes third attribute information of the two-dimensional image, where the third attribute information is used to mark the data frame types of the different channels of the two-dimensional image in the encoded data.

Accordingly, S12022 may specifically include the following steps:

S10: According to the third attribute information, obtain, from the encoded data, the data frame type of the color channel corresponding to the target viewing angle;

S11: If the data frame type of the color channel corresponding to the target viewing angle is a non-key frame type, obtain, from the encoded data according to the position information of the color channel, the key frame coding and non-key frame coding of the color channel corresponding to the target viewing angle, and generate the color channel coding corresponding to the target viewing angle from them;

S12: If the data frame type of the color channel corresponding to the target viewing angle is a key frame type, obtain, from the encoded data according to the position information of the color channel, the key frame coding of the color channel corresponding to the target viewing angle, and use it as the color channel coding corresponding to the target viewing angle.

Accordingly, S12023 may specifically include the following steps:

S20: According to the third attribute information, obtain, from the encoded data, the data frame type of the transparency channel corresponding to the target viewing angle;

S21: If the data frame type of the transparency channel corresponding to the target viewing angle is a non-key frame type, obtain, from the encoded data according to the position information of the transparency channel, the key frame coding and non-key frame coding of the transparency channel corresponding to the target viewing angle, and generate the transparency channel coding corresponding to the target viewing angle from them;

S22: If the data frame type of the transparency channel corresponding to the target viewing angle is a key frame type, obtain, from the encoded data according to the position information of the transparency channel, the key frame coding of the transparency channel corresponding to the target viewing angle, and use it as the transparency channel coding corresponding to the target viewing angle.

The data frame type is the frame type of a two-dimensional image; the two-dimensional image corresponding to each viewing angle corresponds to a unique frame type. Two-dimensional images may include key frame type images and non-key frame type images, so the encoded data may include key frame coding and non-key frame coding.

To improve the server's encoding efficiency for two-dimensional images, the acquired two-dimensional images at multiple viewing angles can be grouped, and the images within each group are compressed and encoded. Specifically, the two-dimensional images corresponding to 9 consecutively spaced viewing angles can be manually divided into one group, where the one two-dimensional image at the center of each group is marked as a key frame type image, and the 8 images at non-center positions are marked as non-key frame type images. When the server compresses and encodes each group, the encoded data corresponding to the center image is determined to be key frame coding, and the encoded data corresponding to the 8 non-center images is determined to be non-key frame coding.

It should be noted that, during the server's compression encoding of each group, the compression coding corresponding to a key frame type image is used entirely as key frame coding, while part of the compression coding corresponding to a non-key frame type image is used as key frame coding and the rest as non-key frame coding. Therefore, for the color channel, when the client obtains the per-channel encoded data corresponding to the target viewing angle: if the data frame type of the color channel corresponding to the target viewing angle is a non-key frame type, both the key frame coding and the non-key frame coding of that color channel need to be obtained, and the color channel coding corresponding to the target viewing angle is generated from part of the key frame coding together with the non-key frame coding; if the data frame type is a key frame type, the key frame coding of the color channel is used directly as the color channel coding corresponding to the target viewing angle.
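The branching just described can be summarized in a small sketch. This hypothetical Python helper only gathers the byte ranges that would be handed to a decoder, using header field names from the protocol-header structure shown elsewhere in this document; the function name and toy byte layout are assumptions.

```python
def collect_decoder_input(binary, header):
    """Pick the byte ranges to feed the decoder for one channel.
    A key frame (I) decodes on its own; a non-key frame (P) needs the
    reference I frame of its group first, then its own P-frame bytes."""
    own = binary[header["Offset"]:header["Offset"] + header["Length"]]
    if header["Frame Type"] == "I":
        return own
    ref = binary[header["I_Offset"]:header["I_Offset"] + header["I_Length"]]
    return ref + own

binary = b"IIIIppp"  # toy layout: 4 I-frame bytes followed by 3 P-frame bytes
p_header = {"Frame Type": "P", "Offset": 4, "Length": 3,
            "I_Offset": 0, "I_Length": 4}
stream = collect_decoder_input(binary, p_header)
```

The same selection logic applies to the transparency channel, using the `Alpha_*` fields in place of the color-channel fields.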
The key frame coding corresponding to the target viewing angle refers to the encoded data of the key frame within the group to which the target viewing angle belongs. The non-key frame coding corresponding to the target viewing angle refers to the encoded data actually corresponding to that viewing angle. It should be noted that the principle for determining the transparency channel coding is the same as for the color channel coding and is not repeated here.

For ease of understanding, the color channel coding and transparency channel coding corresponding to each group of two-dimensional images can be expressed as follows:

In the above table, the color channel coding corresponding to each group of two-dimensional images includes one key frame (I) coding and 8 non-key frame (P) codings, and the transparency channel coding corresponding to each group likewise includes one key frame (I) coding and 8 non-key frame (P) codings.

Thus, for different data frame types, the encoded data can be obtained using different logic; for the color channel and the transparency channel, this guarantees the completeness and accuracy of obtaining the encoded data of both channels.

To facilitate quickly obtaining the encoded data corresponding to different viewing angles, in some embodiments the encoded data includes a protocol header, where the protocol header includes one or more combinations of the first attribute information, the second attribute information, and the third attribute information.
To further improve the client's efficiency in parsing the encoded data, the client can store the protocol headers from the encoded data locally, and then obtain and quickly parse the encoded data locally.

In an embodiment of the present disclosure, optionally, before S1202, the method further includes the following step:

storing, based on the protocol header format corresponding to each viewing angle, the protocol header corresponding to each viewing angle in a preset storage structure;

Accordingly, S1202 may specifically include the following step:

obtaining, from the preset storage structure, the target protocol header corresponding to the target viewing angle, and parsing the encoded data corresponding to the target protocol header to obtain the two-dimensional target image corresponding to the target viewing angle.

The preset storage structure may be an in-memory storage structure on the client, specifically a Map structure. A Map is a collection that maps key objects to value objects; each of its elements contains a key object paired with a value object.

Optionally, the code of a protocol header in the preset storage structure may have the following structure:
{
"Longitude,Latitude":[
{
"Frame Type":"I/P",
"Offset":"Current Offset",
"Length":"Current Frame Length",
"Alpha_Offset":"Current Frame Alpha Offset",
"Alpha_Length":"Current Frame Alpha length",
"I_Offset":"Reference I Frame Offset",
"I_Length":"Reference I Frame Length",
"Alpha_I_Offset":"Reference I Frame Alpha Offset",
"Alpha_I_Length":"Reference I Frame Alpha Length"
}
]
}
Thus, in the decoding stage, by parsing the protocol header from the preset storage structure on the client, the encoded data corresponding to the target viewing angle can be located quickly, improving the parsing efficiency of the two-dimensional target image corresponding to the target viewing angle and further optimizing image display efficiency.

To facilitate understanding of how the encoded data of different channels corresponding to the target viewing angle is obtained from the encoded data, and of the logic for obtaining non-key frame type and key frame type encoded data, FIG. 2 shows a logical schematic diagram of an image display method provided by an embodiment of the present disclosure.

As shown in FIG. 2, the image display method includes the following process:

S210: Receive a two-dimensional image set sent by the server, where the set records two-dimensional images of a three-dimensional object model at multiple different viewing angles.

S220: In response to a display instruction for the three-dimensional object model at a target viewing angle, parse the corresponding two-dimensional target image from the two-dimensional image set.

Specifically, before S220, the client can store, based on the protocol header format corresponding to each viewing angle, the protocol header corresponding to each viewing angle in a preset storage structure, for example in a map.

S230: Obtain, from the preset storage structure, the target protocol header corresponding to the target viewing angle.

S240: Parse the first attribute information in the target protocol header corresponding to the target viewing angle.

The size information in the first attribute information is used to mark the display size of the two-dimensional image, and the target viewing angle is determined according to the viewing angle information in the first attribute information.

S250: Obtain, from the second attribute information in the target protocol header, the position information of the color channel of the two-dimensional target image in the encoded data.

The second attribute information is used to mark the position information of the different channels of the two-dimensional image in the encoded data.

S260: According to the third attribute information in the target protocol header, obtain, from the encoded data, the data frame type of the color channel corresponding to the target viewing angle.

The third attribute information is used to mark the data frame types of the different channels of the two-dimensional image in the encoded data.

S270: Determine whether the data frame type of the color channel is a key frame type.

Specifically, if the data frame type of the color channel is a key frame type, S280 is executed; otherwise, S290 is executed.

S280: According to the position information of the color channel corresponding to the target viewing angle, obtain the key frame coding of the color channel from the encoded data, and use it as the color channel coding corresponding to the target viewing angle.

S290: According to the position information of the color channel corresponding to the target viewing angle, obtain the key frame coding and non-key frame coding of the color channel from the encoded data, and generate the color channel coding corresponding to the target viewing angle from them.

S291: Obtain, from the second attribute information in the target protocol header, the position information of the transparency channel of the two-dimensional target image in the encoded data.

S292: According to the third attribute information in the target protocol header, obtain, from the encoded data, the data frame type of the transparency channel corresponding to the target viewing angle.

S293: Determine whether the data frame type of the transparency channel is a key frame type.

Specifically, if the data frame type of the transparency channel is a key frame type, S294 is executed; otherwise, S295 is executed.

S294: According to the position information of the transparency channel corresponding to the target viewing angle, obtain the key frame coding of the transparency channel from the encoded data, and use it as the transparency channel coding corresponding to the target viewing angle.

S295: According to the position information of the transparency channel corresponding to the target viewing angle, obtain the key frame coding and non-key frame coding of the transparency channel from the encoded data, and generate the transparency channel coding corresponding to the target viewing angle from them.

S296: Fuse the color channel coding and the transparency channel coding to generate and display the two-dimensional target image corresponding to the target viewing angle.
在本公开再一种实施方式中,提供了一种降低网络传输资源以及内存占用的图像处理方法。其中,该图像处理方法可以应用于服务器。其中,该服 务器可以是云服务器或者是服务器集群。
图3示出了本公开实施例提供的一种图像处理方法的流程示意图。如图3所示,该图像处理方法包括如下步骤。
S310、获取三维对象模型在多个视角生成的二维图像集合,其中,二维图像集合用于记录三维对象模型在多个不同视角下的二维图像。
可以理解的是,视角的具体形式可以是经纬度。具体的,可以设置纬度范围[-90,90]且经度范围[0,360],针对三维对象模型,每6度采集一张图像,则可以采集包含360/6*(180/6+1)=1860张二维图像,若将二维图像每18度分成一组,则可以获取(180/18+1)*(360/18)=220组二维图像。具体的,除了南北极点之外,可以上述数量的二维图像按照3*3分为一组,即9张图像合成一组,总共为(180/6-1)*(360/6)=1740张二维图像,并将每组图像中位于中心位置的二维图像设为关键帧类型的图像,将每组图像中位于非中心位置的二维图像设为非关键帧类型的图像;针对南北极点(*,-90),(*,90),则将二维图像按照3*2共6张合成一组,极点为关键帧类型的图像,外围5张非关键帧类型的图像,总共为2*(360/6)=120张。
To facilitate understanding of the two-dimensional image capture process, Fig. 4 shows a schematic diagram of obtaining two-dimensional images.
As shown in Fig. 4, for the three-dimensional object model, one two-dimensional image may be captured every 6 degrees, for example at the perspectives (-6, 6), (0, 0), (-6, 0) and so on, yielding 9 two-dimensional images that form one group. In each group, the two-dimensional image at the center position is set as the keyframe-type image and the two-dimensional images at non-center positions as non-keyframe-type images; the keyframe-type image of each group can serve as the reference for the non-keyframe-type images, so each group includes 1 keyframe-type image and 8 non-keyframe-type images.
S320: send the two-dimensional image set to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
In some embodiments, the server may directly send the two-dimensional image set including the two-dimensional images corresponding to the multiple perspectives to the client, so that the client finds the two-dimensional target image corresponding to the target perspective among the two-dimensional images and displays it.
In other embodiments, the server may compress and encode the two-dimensional images corresponding to the multiple perspectives to obtain encoded data, thereby obtaining a two-dimensional image set that records the two-dimensional images at the multiple different perspectives, and send the two-dimensional image set to the client, so that the client parses and displays the two-dimensional target image corresponding to the target perspective.
An embodiment of the present disclosure provides an image processing method in which the server obtains a two-dimensional image set generated from a three-dimensional object model at multiple perspectives, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives, and sends the two-dimensional image set to the client so that the client parses and displays the two-dimensional target image corresponding to the target perspective. Through the above process, the processing of the three-dimensional object model is performed on the server, so the client only needs to obtain and display the two-dimensional target image corresponding to the target perspective. The image display process on the client therefore does not occupy excessive network resources and memory, which avoids stuttering during image display and ultimately improves the user's interactive viewing experience.
In yet another embodiment of the present disclosure, the server compresses and encodes the two-dimensional images corresponding to the multiple perspectives, and then sends the encoded data corresponding to the multiple perspectives to the client.
In an embodiment of the present disclosure, optionally, S310 may specifically include the following step:
compress and encode the two-dimensional images at the multiple different perspectives to generate encoded data at the multiple different perspectives, and use the encoded data at the multiple different perspectives as the two-dimensional image set.
Specifically, in the process of generating the encoded data, the server may use an encoder such as an H.265 encoder to compress and encode the two-dimensional images at the multiple different perspectives, generating the encoded data corresponding to those images, i.e. a binary file, and use the encoded data at the multiple different perspectives as the two-dimensional image set.
Optionally, the encoded data may include parameter encoding and an encoding body, where the parameter encoding refers to the parameters resulting from encoding the two-dimensional images, and the encoding body refers to the body of the encoded data.
Optionally, each parameter encoding may include parameters such as a video parameter set (VPS_NUT), a sequence parameter set (SPS_NUT) and a picture parameter set (PPS_NUT).
To further reduce the resources occupied by the encoded data, only one copy of the above parameter encoding may be kept for the encoded data of the two-dimensional images within each group. In some cases, if 9 two-dimensional images form one group, the encoded data corresponding to each group of two-dimensional images may be represented as follows:

Thus, in an embodiment of the present disclosure, the server can compress and encode the two-dimensional images at the multiple different perspectives to generate a small binary file and send the binary file to the client, so that the client can download the encoded data within a very short time.
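The per-group layout described above (one shared parameter encoding followed by the group's frame payloads) might be packed as in this sketch; the offset/length bookkeeping is what header fields such as "Alpha_I_Offset"/"Alpha_I_Length" would later record. The function and the index layout are assumptions, not the disclosure's actual byte format:

```python
def pack_group(parameter_encoding, frame_payloads):
    """Write one shared parameter encoding (e.g. VPS/SPS/PPS) followed by
    each frame's payload, recording per-frame offset and length."""
    blob = bytearray(parameter_encoding)
    index = []
    for payload in frame_payloads:
        index.append({"offset": len(blob), "length": len(payload)})
        blob += payload
    return bytes(blob), index
```

Keeping a single parameter block per group is what shrinks the binary relative to nine independently parameterized streams.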
To enable the client to parse the encoded data accurately, the encoded data includes multiple kinds of attribute information.
In some embodiments, the encoded data includes first attribute information of the two-dimensional images, where the size information in the first attribute information is used to mark the display size of the two-dimensional images and the perspective information in the first attribute information is used to mark the perspectives of the two-dimensional images.
In other embodiments, the encoded data further includes second attribute information of the two-dimensional images, where the second attribute information is used to mark the position information of the different channels of the two-dimensional images in the encoded data, so that the client obtains the data of the different channels of the two-dimensional image corresponding to the target perspective from the encoded data according to the position information.
In still other embodiments, the encoded data further includes third attribute information of the two-dimensional images, where the third attribute information is used to mark the data frame types of the different channels of the two-dimensional images in the encoded data, so that the client decodes the data of the different channels of the two-dimensional image corresponding to the target perspective according to the data frame types.
Further, to facilitate quickly obtaining the encoded data corresponding to different perspectives, in some embodiments the server may also add a protocol header to the encoded data of the two-dimensional image corresponding to each perspective.
Specifically, the server may add the corresponding protocol header to the encoded data of the two-dimensional image corresponding to each perspective based on a preset protocol-header protocol, obtaining encoded data carrying protocol headers.
Optionally, the preset header protocol may be an autoregressive (AR) protocol.
The protocol header may include one or a combination of the above first attribute information, second attribute information and third attribute information.
Optionally, the protocol header may specifically have the following format:
Thus, in the encoding stage, the protocol header can mark multiple kinds of information about its corresponding encoded data, and in the subsequent decoding stage it enables the client to quickly obtain the encoded data corresponding to different perspectives, ultimately optimizing image display efficiency.
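Since the concrete header format is not reproduced in this excerpt, the sketch below only illustrates how the three kinds of attribute information could sit in one header; apart from the "Alpha_I_Offset"/"Alpha_I_Length" names shown in the fragment earlier, every field name here is hypothetical:

```python
def make_protocol_header(width, height, lat, lon, channel_index):
    """Combine the three kinds of attribute information in one header."""
    return {
        # first attribute information: display size and perspective
        "size": {"width": width, "height": height},
        "view": {"lat": lat, "lon": lon},
        # second + third attribute information: per-channel position
        # (offset/length) and data frame type within the encoded data
        "channels": channel_index,
    }

header = make_protocol_header(512, 512, 0, 0, {
    "Alpha_I_Offset": 2,
    "Alpha_I_Length": 3,
    "Alpha_Frame_Type": "I",
})
```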
In yet another embodiment of the present disclosure, to facilitate understanding of the image processing process and the image display process, Fig. 5 shows a logical schematic diagram of an image processing and display method.
As shown in Fig. 5, the image processing and display process includes the following steps:
S510: obtain two-dimensional images generated from a three-dimensional object model at multiple perspectives.
S520: obtain the image data of the color channel of the two-dimensional images corresponding to the multiple perspectives.
S530: obtain the image data of the transparency channel of the two-dimensional images corresponding to the multiple perspectives.
S540: compress and encode the image data of the color channel of the two-dimensional images corresponding to the multiple perspectives to obtain the color channel encodings corresponding to the multiple perspectives.
S550: compress and encode the image data of the transparency channel of the two-dimensional images corresponding to the multiple perspectives to obtain the transparency channel encodings corresponding to the multiple perspectives.
S560: merge the color channel encodings corresponding to the multiple perspectives and the transparency channel encodings corresponding to the multiple perspectives to obtain the encoded data of the two-dimensional images corresponding to the multiple perspectives.
S570: add protocol headers to the encoded data of the two-dimensional images corresponding to the multiple perspectives to generate the two-dimensional image set of the three-dimensional object model at the multiple perspectives, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives.
It should be noted that S510 to S570 are all executed by the server.
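Steps S520/S530 above, which separate the color and transparency data that are then encoded independently, can be sketched as follows (a pixel-list sketch; the disclosure does not prescribe an in-memory representation):

```python
def split_channels(rgba_pixels):
    """S520/S530 sketch: split an RGBA image into the color plane and the
    transparency plane, which S540/S550 would then encode separately."""
    color_plane = [(r, g, b) for (r, g, b, _a) in rgba_pixels]
    alpha_plane = [a for (_r, _g, _b, a) in rgba_pixels]
    return color_plane, alpha_plane
```

This split is the mirror image of the client-side fusion in S593.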
S580: receive the two-dimensional image set sent by the server.
S590: in response to an instruction to display the three-dimensional object model at a target perspective, parse the protocol header included in the encoded data corresponding to the target perspective.
S591: obtain, based on the attribute information in the protocol header, the color channel encoding corresponding to the target perspective from the encoded data.
S592: obtain, based on the attribute information in the protocol header, the transparency channel encoding corresponding to the target perspective from the encoded data.
S593: fuse the color channel encoding corresponding to the target perspective with the transparency channel encoding corresponding to the target perspective to obtain and display the two-dimensional target image corresponding to the target perspective.
It should be noted that S580 to S593 are all executed by the client.
Based on the same inventive concept as the above method embodiments, the present disclosure further provides an image display apparatus configured on a client. Referring to Fig. 6, a schematic structural diagram of an image display apparatus provided by an embodiment of the present disclosure, the image display apparatus 600 includes:
a receiving module 601 configured to receive a two-dimensional image set sent by a server, the two-dimensional image set being used to record two-dimensional images of a three-dimensional object model at multiple different perspectives;
a parsing module 602 configured to, in response to an instruction to display the three-dimensional object model at a target perspective, parse, from the two-dimensional image set, a two-dimensional target image corresponding to the target perspective;
an image display module 603 configured to display the two-dimensional target image.
An embodiment of the present disclosure provides an image display apparatus that receives a two-dimensional image set sent by a server, the two-dimensional image set being used to record two-dimensional images of a three-dimensional object model at multiple different perspectives; in response to an instruction to display the three-dimensional object model at a target perspective, parses, from the two-dimensional image set, the two-dimensional target image corresponding to the target perspective; and displays the two-dimensional target image. Through the above process, the client can determine the two-dimensional target image of the target perspective directly from the two-dimensional image set and display it, and the complexity and memory footprint of the two-dimensional target image are small, so the client's image display process does not occupy excessive network resources and memory, which avoids stuttering during image display and ultimately improves the user's viewing experience.
In an optional implementation, the parsing module 602 includes:
an obtaining unit configured to obtain, from the two-dimensional image set, the encoded data corresponding to the two-dimensional target image at the target perspective;
a parsing unit configured to parse the encoded data to obtain the two-dimensional target image corresponding to the target perspective.
In an optional implementation, the encoded data includes first attribute information of the two-dimensional images, where the size information in the first attribute information is used to mark the display size of the two-dimensional images, and the target perspective is determined according to the perspective information in the first attribute information.
In an optional implementation, the encoded data includes second attribute information of the two-dimensional images, where the second attribute information is used to mark the position information of the different channels of the two-dimensional images in the encoded data;
correspondingly, the parsing unit is specifically configured to obtain, from the second attribute information, the position information of the color channel of the two-dimensional target image in the encoded data and the position information of the transparency channel of the two-dimensional target image in the encoded data;
obtain, from the encoded data based on the position information of the color channel of the two-dimensional target image in the encoded data, the color channel encoding corresponding to the target perspective;
obtain, from the encoded data based on the position information of the transparency channel of the two-dimensional target image in the encoded data, the transparency channel encoding corresponding to the target perspective;
fuse the color channel encoding and the transparency channel encoding to generate the two-dimensional target image corresponding to the target perspective.
In an optional implementation, the encoded data further includes third attribute information of the two-dimensional images, where the third attribute information is used to mark the data frame types of the different channels of the two-dimensional images in the encoded data;
correspondingly, the parsing unit is further configured to obtain, from the encoded data according to the third attribute information, the data frame type of the color channel corresponding to the target perspective;
if the data frame type of the color channel corresponding to the target perspective is the non-keyframe type, obtain, from the encoded data according to the position information of the color channel corresponding to the target perspective, the keyframe encoding and the non-keyframe encoding of the color channel corresponding to the target perspective, and generate the color channel encoding corresponding to the target perspective from the keyframe encoding and the non-keyframe encoding;
if the data frame type of the color channel corresponding to the target perspective is the keyframe type, obtain, from the encoded data according to the position information of the color channel corresponding to the target perspective, the keyframe encoding of the color channel corresponding to the target perspective, and use the keyframe encoding of the color channel corresponding to the target perspective as the color channel encoding corresponding to the target perspective.
In an optional implementation, the parsing unit is further configured to obtain, from the encoded data according to the third attribute information, the data frame type of the transparency channel corresponding to the target perspective;
if the data frame type of the transparency channel corresponding to the target perspective is the non-keyframe type, obtain, from the encoded data according to the position information of the transparency channel corresponding to the target perspective, the keyframe encoding and the non-keyframe encoding of the transparency channel corresponding to the target perspective, and generate the transparency channel encoding corresponding to the target perspective from the keyframe encoding and the non-keyframe encoding;
if the data frame type of the transparency channel corresponding to the target perspective is the keyframe type, obtain, from the encoded data according to the position information of the transparency channel corresponding to the target perspective, the keyframe encoding of the transparency channel corresponding to the target perspective, and use the keyframe encoding of the transparency channel corresponding to the target perspective as the transparency channel encoding corresponding to the target perspective.
In an optional implementation, the encoded data includes a protocol header, where the protocol header includes one or a combination of the first attribute information, the second attribute information and the third attribute information.
Based on the same inventive concept as the above method embodiments, the present disclosure further provides an image processing apparatus configured on a server. Referring to Fig. 7, a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure, the image processing apparatus 700 includes:
an obtaining module 701 configured to obtain a two-dimensional image set generated from a three-dimensional object model at multiple perspectives, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives;
a sending module 702 configured to send the two-dimensional image set to a client, so that the client parses and displays a two-dimensional target image corresponding to a target perspective.
An embodiment of the present disclosure provides an image processing apparatus in which the server obtains a two-dimensional image set generated from a three-dimensional object model at multiple perspectives, where the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives, and sends the two-dimensional image set to the client so that the client parses and displays the two-dimensional target image corresponding to the target perspective. Through the above process, the processing of the three-dimensional object model is performed on the server, so the client only needs to obtain and display the two-dimensional target image corresponding to the target perspective. The image display process on the client therefore does not occupy excessive network resources and memory, which avoids stuttering during image display and ultimately improves the user's interactive viewing experience.
In an optional implementation, the sending module 702 includes:
a compression encoding unit configured to compress and encode the two-dimensional images at the multiple different perspectives, generate encoded data at the multiple different perspectives, and use the encoded data at the multiple different perspectives as the two-dimensional image set.
In an optional implementation, the encoded data includes first attribute information of the two-dimensional images, where the size information in the first attribute information is used to mark the display size of the two-dimensional images and the perspective information in the first attribute information is used to mark the perspectives of the two-dimensional images.
In an optional implementation, the encoded data further includes second attribute information of the two-dimensional images, where the second attribute information is used to mark the position information of the different channels of the two-dimensional images in the encoded data, so that the client obtains the data of the different channels of the two-dimensional image corresponding to the target perspective from the encoded data according to the position information.
In an optional implementation, the encoded data further includes third attribute information of the two-dimensional images, where the third attribute information is used to mark the data frame types of the different channels of the two-dimensional images in the encoded data, so that the client decodes the data of the different channels of the two-dimensional image corresponding to the target perspective according to the data frame types.
In addition to the above methods and apparatuses, an embodiment of the present disclosure further provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to implement the image display method or the image processing method of the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer program product including a computer program/instructions which, when executed by a processor, implement the image display method or the image processing method of the embodiments of the present disclosure.
Fig. 8 shows a schematic structural diagram of a client or server provided by an embodiment of the present disclosure.
As shown in Fig. 8, the client or server may include a controller 801 and a memory 802 storing computer program instructions.
Specifically, the controller 801 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
The memory 802 may include mass storage for information or instructions. By way of example and not limitation, the memory 802 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 802 may include removable or non-removable (or fixed) media. Where appropriate, the memory 802 may be internal or external to an integrated gateway device. In a particular embodiment, the memory 802 is non-volatile solid-state memory. In a particular embodiment, the memory 802 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The controller 801 reads and executes the computer program instructions stored in the memory 802 to perform the steps of the image display method provided by the embodiments of the present disclosure, or the steps of the image processing method provided by the embodiments of the present disclosure.
In one example, the client or server may further include a transceiver 803 and a bus 804, where, as shown in Fig. 8, the controller 801, the memory 802 and the transceiver 803 are connected via the bus 804 and communicate with one another.
The bus 804 includes hardware, software, or both. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), another suitable bus, or a combination of two or more of these. Where appropriate, the bus 804 may include one or more buses. Although specific buses are described and shown in the embodiments of the present application, the present application contemplates any suitable bus or interconnect.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The above are only specific embodiments of the present disclosure, enabling those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not limited to the embodiments described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (17)

  1. An image display method, applied to a client, the method comprising:
    receiving a two-dimensional image set sent by a server, the two-dimensional image set being used to record two-dimensional images of a three-dimensional object model at multiple different perspectives;
    in response to an instruction to display the three-dimensional object model at a target perspective, parsing, from the two-dimensional image set, a two-dimensional target image corresponding to the target perspective;
    displaying the two-dimensional target image.
  2. The image display method according to claim 1, wherein the parsing, from the two-dimensional image set, the two-dimensional target image corresponding to the target perspective comprises:
    obtaining, from the two-dimensional image set, encoded data corresponding to the two-dimensional target image at the target perspective;
    parsing the encoded data to obtain the two-dimensional target image corresponding to the target perspective.
  3. The image display method according to claim 2, wherein the encoded data comprises first attribute information of the two-dimensional images, wherein size information in the first attribute information is used to mark a display size of the two-dimensional images, and the target perspective is determined according to perspective information in the first attribute information.
  4. The image display method according to claim 2, wherein the encoded data comprises second attribute information of the two-dimensional images, wherein the second attribute information is used to mark position information of different channels of the two-dimensional images in the encoded data;
    correspondingly, the parsing the encoded data to obtain the two-dimensional target image corresponding to the target perspective comprises:
    obtaining, from the second attribute information, position information of a color channel of the two-dimensional target image in the encoded data and position information of a transparency channel of the two-dimensional target image in the encoded data;
    obtaining, from the encoded data based on the position information of the color channel of the two-dimensional target image in the encoded data, a color channel encoding corresponding to the target perspective;
    obtaining, from the encoded data based on the position information of the transparency channel of the two-dimensional target image in the encoded data, a transparency channel encoding corresponding to the target perspective;
    fusing the color channel encoding and the transparency channel encoding to generate the two-dimensional target image corresponding to the target perspective.
  5. The image display method according to claim 4, wherein the encoded data further comprises third attribute information of the two-dimensional images, wherein the third attribute information is used to mark data frame types of the different channels of the two-dimensional images in the encoded data;
    correspondingly, the obtaining, from the encoded data based on the position information of the color channel, the color channel encoding corresponding to the target perspective comprises:
    obtaining, from the encoded data according to the third attribute information, a data frame type of the color channel corresponding to the target perspective;
    if the data frame type of the color channel corresponding to the target perspective is a non-keyframe type, obtaining, from the encoded data according to the position information of the color channel corresponding to the target perspective, a keyframe encoding and a non-keyframe encoding of the color channel corresponding to the target perspective, and generating the color channel encoding corresponding to the target perspective from the keyframe encoding and the non-keyframe encoding;
    if the data frame type of the color channel corresponding to the target perspective is a keyframe type, obtaining, from the encoded data according to the position information of the color channel corresponding to the target perspective, the keyframe encoding of the color channel corresponding to the target perspective, and using the keyframe encoding of the color channel corresponding to the target perspective as the color channel encoding corresponding to the target perspective.
  6. The image display method according to claim 5, wherein the obtaining, from the encoded data based on the position information of the transparency channel, the transparency channel encoding corresponding to the target perspective comprises:
    obtaining, from the encoded data according to the third attribute information, a data frame type of the transparency channel corresponding to the target perspective;
    if the data frame type of the transparency channel corresponding to the target perspective is a non-keyframe type, obtaining, from the encoded data according to the position information of the transparency channel corresponding to the target perspective, a keyframe encoding and a non-keyframe encoding of the transparency channel corresponding to the target perspective, and generating the transparency channel encoding corresponding to the target perspective from the keyframe encoding and the non-keyframe encoding;
    if the data frame type of the transparency channel corresponding to the target perspective is a keyframe type, obtaining, from the encoded data according to the position information of the transparency channel corresponding to the target perspective, the keyframe encoding of the transparency channel corresponding to the target perspective, and using the keyframe encoding of the transparency channel corresponding to the target perspective as the transparency channel encoding corresponding to the target perspective.
  7. The image display method according to any one of claims 2-6, wherein the encoded data comprises a protocol header, wherein the protocol header comprises one or a combination of the first attribute information, the second attribute information and the third attribute information.
  8. An image processing method, applied to a server, the image processing method comprising:
    obtaining a two-dimensional image set generated from a three-dimensional object model at multiple perspectives, wherein the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives;
    sending the two-dimensional image set to a client, so that the client parses and displays a two-dimensional target image corresponding to a target perspective.
  9. The image processing method according to claim 8, wherein the obtaining the two-dimensional image set generated from the three-dimensional object model at multiple perspectives comprises:
    compressing and encoding the two-dimensional images at the multiple different perspectives to generate encoded data at the multiple different perspectives, and using the encoded data at the multiple different perspectives as the two-dimensional image set.
  10. The image processing method according to claim 9, wherein the encoded data comprises first attribute information of the two-dimensional images, wherein size information in the first attribute information is used to mark a display size of the two-dimensional images, and perspective information in the first attribute information is used to mark perspectives of the two-dimensional images.
  11. The image processing method according to claim 9 or 10, wherein the encoded data further comprises second attribute information of the two-dimensional images, wherein the second attribute information is used to mark position information of different channels of the two-dimensional images in the encoded data, so that the client obtains data of the different channels of the two-dimensional image corresponding to the target perspective from the encoded data according to the position information.
  12. The image processing method according to any one of claims 9-11, wherein the encoded data further comprises third attribute information of the two-dimensional images, wherein the third attribute information is used to mark data frame types of the different channels of the two-dimensional images in the encoded data, so that the client decodes data of the different channels of the two-dimensional image corresponding to the target perspective according to the data frame types.
  13. An image display apparatus, configured on a client, the image display apparatus comprising:
    a receiving module configured to receive a two-dimensional image set sent by a server, the two-dimensional image set being used to record two-dimensional images of a three-dimensional object model at multiple different perspectives;
    a parsing module configured to, in response to an instruction to display the three-dimensional object model at a target perspective, parse, from the two-dimensional image set, a two-dimensional target image corresponding to the target perspective;
    an image display module configured to display the two-dimensional target image.
  14. An image processing apparatus, configured on a server, the image processing apparatus comprising:
    an obtaining module configured to obtain a two-dimensional image set generated from a three-dimensional object model at multiple perspectives, wherein the two-dimensional image set is used to record two-dimensional images of the three-dimensional object model at multiple different perspectives;
    a sending module configured to send the two-dimensional image set to a client, so that the client parses and displays a two-dimensional target image corresponding to a target perspective.
  15. A computer-readable storage medium storing instructions, wherein, when the instructions are run on a terminal device, they cause the terminal device to implement the image display method according to any one of claims 1-7 or the image processing method according to any one of claims 8-12.
  16. A device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein, when the processor executes the computer program, the image display method according to any one of claims 1-7 or the image processing method according to any one of claims 8-12 is implemented.
  17. A computer program product, comprising a computer program/instructions, wherein, when the computer program/instructions are executed by a processor, the image display method according to any one of claims 1-7 or the image processing method according to any one of claims 8-12 is implemented.
PCT/CN2023/113854 2022-08-19 2023-08-18 Image display method, image processing method, apparatus, device and medium WO2024037643A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210999321.1 2022-08-19
CN202210999321.1A CN117640967A (zh) 2022-08-19 Image display method, image processing method, apparatus, device and medium

Publications (1)

Publication Number Publication Date
WO2024037643A1 true WO2024037643A1 (zh) 2024-02-22

Family

ID=89940863

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/113854 WO2024037643A1 (zh) 2023-08-18 2022-08-19 Image display method, image processing method, apparatus, device and medium

Country Status (2)

Country Link
CN (1) CN117640967A (zh)
WO (1) WO2024037643A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006260280A (ja) * 2005-03-17 2006-09-28 Fujitsu Ltd Model data display program, model data display apparatus and model data display method
CN107945282A (zh) * 2017-12-05 2018-04-20 Fast multi-view three-dimensional synthesis and display method and apparatus based on adversarial networks
CN113559498A (zh) * 2021-07-02 2021-10-29 Three-dimensional model display method and apparatus, storage medium and electronic device
CN114648615A (zh) * 2022-05-24 2022-06-21 Control method, apparatus, device and storage medium for interactive reproduction of a target object


Also Published As

Publication number Publication date
CN117640967A (zh) 2024-03-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23854538

Country of ref document: EP

Kind code of ref document: A1