WO2023029207A1 - Video data processing method, decoding device, encoding device, and storage medium - Google Patents

Video data processing method, decoding device, encoding device, and storage medium

Info

Publication number
WO2023029207A1
WO2023029207A1 PCT/CN2021/129225 CN2021129225W WO2023029207A1 WO 2023029207 A1 WO2023029207 A1 WO 2023029207A1 CN 2021129225 W CN2021129225 W CN 2021129225W WO 2023029207 A1 WO2023029207 A1 WO 2023029207A1
Authority
WO
WIPO (PCT)
Prior art keywords
viewpoint
image
main
target
frame
Prior art date
Application number
PCT/CN2021/129225
Other languages
French (fr)
Chinese (zh)
Inventor
王荣刚
王振宇
高文
Original Assignee
北京大学深圳研究生院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学深圳研究生院 filed Critical 北京大学深圳研究生院
Publication of WO2023029207A1 publication Critical patent/WO2023029207A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04845 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images

Definitions

  • the present application relates to the technical field of video data processing, and in particular to a video data processing method, a decoding device, an encoding device and a storage medium.
  • the free viewpoint technology is a technology for viewing videos from a free viewpoint.
  • the current free-viewpoint application using the free-viewpoint technology can allow viewers to watch videos in the form of continuous viewpoints within a certain range.
  • the viewer can set the position and angle of the point of view, and is no longer limited to watching a video shot by a fixed camera angle of view, realizing a 360° free viewing angle to watch the video.
  • the current free-viewpoint applications often use the spatial stitching method to splice single-channel videos from multiple viewpoints together.
  • after stitching the single-channel videos of the viewpoints together, the free-viewpoint application displays to the user the single-channel video corresponding to the viewpoint the user has switched to.
  • however, stitching lowers the resolution of the single-channel video from each viewpoint, so the picture resolution available to the free-viewpoint application is insufficient and the resolution of the finally generated viewpoint picture is low.
  • the embodiments of the present application provide a video data processing method, a decoding device, an encoding device, and a storage medium, aiming to solve the technical problem that, after single-channel videos from multiple viewpoints are spliced using the spatial-domain splicing method, the picture resolution required for display by free-viewpoint applications is insufficient, which in turn reduces the resolution of the finally generated viewpoint picture.
  • An embodiment of the present application provides a video data processing method applied to a decoding device.
  • the video data processing method includes:
  • when a viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is obtained, the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, and that image is sent to the display device to generate the target viewpoint picture;
  • when the switching condition is met, the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the target viewpoint, and that image is sent to the display device to generate the image of the target viewpoint.
  • the step of intercepting the image required to generate the image of the target viewpoint from the video frame of the image frame sequence received by the transmission path corresponding to the current viewpoint includes:
  • the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint frame or a viewpoint depth map frame, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint.
  • the switching condition includes at least one of the following:
  • the time stamp of the video frame corresponding to the currently displayed image is the same as the time stamp of the video frame in the transmission path corresponding to the target viewpoint;
  • the time stamp of the video frame of the image frame sequence received from the transmission path corresponding to the current viewpoint reaches a preset time point.
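  • The two switching conditions above can be read as a simple predicate. The following is a minimal sketch under stated assumptions, not the applicant's implementation: `FrameInfo`, `switching_condition_met`, and the integer timestamp unit are hypothetical names introduced only for illustration, and "reaches a preset time point" is interpreted here as "greater than or equal to".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameInfo:
    """Hypothetical per-frame metadata; only the timestamp matters here."""
    timestamp: int  # presentation time in arbitrary ticks (assumed unit)


def switching_condition_met(
    displayed: FrameInfo,
    target_path_frame: Optional[FrameInfo],
    current_path_frame: FrameInfo,
    preset_time_point: Optional[int] = None,
) -> bool:
    """Return True when at least one of the two conditions holds."""
    # Condition 1: the frame being displayed and the frame available on the
    # target viewpoint's transmission path carry the same timestamp.
    same_timestamp = (
        target_path_frame is not None
        and displayed.timestamp == target_path_frame.timestamp
    )
    # Condition 2: the frame received on the current viewpoint's path has
    # reached the preset time point (interpreted as >=, an assumption).
    reached_preset = (
        preset_time_point is not None
        and current_path_frame.timestamp >= preset_time_point
    )
    return same_timestamp or reached_preset
```

  • In this sketch the decoder would evaluate the predicate once per received frame and change transmission paths the first time it returns `True`.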
  • An embodiment of the present application provides a video data processing method applied to a coding device.
  • the video data processing method includes:
  • each viewpoint is used as a main viewpoint to generate a first image, and the viewpoints other than the main viewpoint are used as secondary viewpoints corresponding to that main viewpoint to generate second images of the secondary viewpoints, each image comprising at least one of a viewpoint frame or a viewpoint depth map frame;
  • the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are spliced to obtain a video frame corresponding to the main viewpoint, and the spliced video frames corresponding to the main viewpoint are encoded according to the shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
  • when the decoding device receives the viewpoint generation and display instruction sent by the display device and has acquired the current viewpoint of the display device according to that instruction, the image frame sequence corresponding to the current viewpoint is transmitted to the decoding device over the transmission path corresponding to the current viewpoint.
  • the step of splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain a video frame corresponding to the main viewpoint, and encoding the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence, includes:
  • the step of encoding the spliced image sequence to obtain an image frame sequence corresponding to each of the main viewpoints includes:
  • the arrangement information of the video frame corresponding to the main viewpoint includes at least the viewpoint identifier of each viewpoint and the position information of each viewpoint's image in the video frame corresponding to the main viewpoint;
  • Encoding the spliced image sequence and inserting the arrangement information into a sequence header of the encoded spliced image sequence to obtain an image frame sequence corresponding to the main viewpoint.
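  • The text does not say how the arrangement information is laid out inside the sequence header. As a purely illustrative sketch, the code below packs one record per viewpoint (viewpoint identifier, top-left x and y, width, height) into bytes and parses it back. The byte layout, `ArrangementEntry`, `pack_arrangement`, and `unpack_arrangement` are all assumptions made for this example and are not part of any codec syntax.

```python
import struct
from typing import List, NamedTuple


class ArrangementEntry(NamedTuple):
    """Hypothetical record: where one viewpoint's image sits in the frame."""
    viewpoint_id: int
    x: int       # column of the top-left pixel
    y: int       # row of the top-left pixel
    width: int
    height: int


_COUNT = struct.Struct(">H")   # number of entries, big-endian uint16
_ENTRY = struct.Struct(">5H")  # five big-endian uint16 fields per entry


def pack_arrangement(entries: List[ArrangementEntry]) -> bytes:
    """Serialize the arrangement information (assumed layout)."""
    out = bytearray(_COUNT.pack(len(entries)))
    for entry in entries:
        out += _ENTRY.pack(*entry)
    return bytes(out)


def unpack_arrangement(blob: bytes) -> List[ArrangementEntry]:
    """Parse the bytes produced by pack_arrangement."""
    (count,) = _COUNT.unpack_from(blob, 0)
    return [
        ArrangementEntry(*_ENTRY.unpack_from(blob, _COUNT.size + i * _ENTRY.size))
        for i in range(count)
    ]
```

  • A decoder that reads such a record once per sequence can locate any viewpoint's image in every frame of that sequence without further signalling.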
  • the present application also provides a decoding device, the decoding device includes:
  • the first receiving module is configured to acquire the current viewpoint of the display device according to the viewpoint generation display instruction when receiving the viewpoint generation display instruction sent by the display device;
  • the first sending module is configured to intercept the image of the current viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, and send the image of the current viewpoint to the display device to generate the current viewpoint picture;
  • the second receiving module is configured to, when receiving the viewpoint switching instruction sent by the display device, acquire the target viewpoint corresponding to the viewpoint switching instruction, intercept the image required for generating the image of the target viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, and send that image to the display device to generate a target viewpoint picture, wherein the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint;
  • the second sending module is configured to, when the switching condition is met, intercept the image required to generate the image of the target viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the target viewpoint, and send that image to the display device to generate the current viewpoint picture.
  • the present application also provides an encoding device, the encoding device includes:
  • the image acquisition module is configured to acquire the images of the viewpoints captured by the cameras, different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is used as the main viewpoint to generate a first image, the viewpoints other than the main viewpoint are used as secondary viewpoints corresponding to the main viewpoint to generate second images of the secondary viewpoints, and each image includes at least one of a viewpoint frame or a viewpoint depth map frame;
  • the splicing and coding module is configured to splice the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain a video frame corresponding to the main viewpoint, and to encode the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
  • the data transmission module is configured to transmit the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
  • the present application also provides a smart device, which includes: a memory, a processor, and a video data processing program stored in the memory and operable on the processor; when the video data processing program is executed by the processor, the steps of the above video data processing method are realized.
  • the present application also provides a storage medium, the storage medium stores a video data processing program, and when the video data processing program is executed by a processor, the steps of the above video data processing method are implemented.
  • when the viewpoint generation and display instruction sent by the display device is received, the current viewpoint of the display device is obtained according to that instruction; the image required to generate the image of the current viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the current viewpoint picture; when the viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is obtained, and the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the target viewpoint picture; when the switching condition is met, the image required for generating the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the target viewpoint.
  • Fig. 1 is a schematic flow chart of the first embodiment of the video data processing method of the present application.
  • Fig. 2 is a schematic flow chart of the second embodiment of the video data processing method of the present application.
  • Fig. 3 is a schematic flow chart of the third embodiment of the video data processing method of the present application.
  • Fig. 4 is a schematic flow chart of the fourth embodiment of the video data processing method of the present application.
  • Fig. 5 is a schematic flow chart of the fifth embodiment of the video data processing method of the present application.
  • Fig. 6 is a schematic diagram of video frame switching in the present application.
  • Fig. 7 is a schematic diagram of the arrangement of video frames in the present application.
  • Fig. 8 is a schematic flow diagram of multi-viewpoint video data in the decoding device of the present application.
  • Fig. 9 is a schematic flow diagram of multi-viewpoint video data in the encoding device of the present application.
  • the embodiments of the present application provide embodiments of the video data processing method. It should be noted that although a logical sequence is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
  • the video data processing method of the present application includes the following steps:
  • Step S110: when receiving the viewpoint generation and display instruction sent by the display device, acquire the current viewpoint of the display device according to the viewpoint generation and display instruction;
  • Step S120: intercept the image required to generate the image of the current viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, and send that image to the display device to generate the current viewpoint picture;
  • Step S130: when receiving the viewpoint switching instruction sent by the display device, obtain the target viewpoint corresponding to the viewpoint switching instruction, intercept the image required for generating the image of the target viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, and send that image to the display device to generate a target viewpoint picture;
  • Step S140: when the switching condition is met, intercept the image required to generate the image of the target viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the target viewpoint, and send that image to the display device to generate the image of the target viewpoint.
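  • Steps S110 to S140 can be pictured as a small state machine that tracks which transmission path is being read and which viewpoint is being cropped out of it. The sketch below is a simplified model under stated assumptions, not the patented decoder: a frame is reduced to a mapping from viewpoint number to sub-image size, the path identifier is assumed to equal the number of its main viewpoint, and `DecoderSketch` and `receive` are hypothetical names.

```python
from typing import Callable, Dict, Optional, Tuple

# A decoded video frame is modelled as {viewpoint_id: (width, height)}, i.e.
# the size of each viewpoint's sub-image inside the stitched frame (assumption).
Frame = Dict[int, Tuple[int, int]]


class DecoderSketch:
    def __init__(self, receive: Callable[[int], Frame]) -> None:
        # receive(path_id) returns the latest frame from that transmission
        # path; path_id equals the path's main viewpoint (assumption).
        self.receive = receive
        self.path: Optional[int] = None       # path currently being read
        self.viewpoint: Optional[int] = None  # viewpoint currently displayed

    def on_display_instruction(self, current_viewpoint: int) -> Tuple[int, int]:
        # S110: obtain the current viewpoint from the instruction.
        self.path = self.viewpoint = current_viewpoint
        return self.next_image()

    def on_switch_instruction(self, target_viewpoint: int) -> Tuple[int, int]:
        # S130: keep reading the current path but take the target viewpoint's
        # (lower-resolution) sub-image from it.
        self.viewpoint = target_viewpoint
        return self.next_image()

    def on_switching_condition_met(self) -> Tuple[int, int]:
        # S140: start reading the target viewpoint's own path, where the
        # target viewpoint is the main (higher-resolution) image.
        self.path = self.viewpoint
        return self.next_image()

    def next_image(self) -> Tuple[int, int]:
        # S120: intercept the displayed viewpoint's sub-image from the frame.
        frame = self.receive(self.path)
        return frame[self.viewpoint]
```

  • With ten viewpoints and the resolutions quoted later in the text, the displayed image in this model goes from 2880*1620 to 960*540 on the switch instruction and back to 2880*1620 once the switching condition is met.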
  • the free viewpoint application allows the viewer to watch the video in the form of continuous viewpoints within a certain range.
  • the viewer can set the position and angle of the viewpoint, and is no longer limited to a fixed camera viewing angle.
  • Such an application often requires multiple cameras to shoot at the same time and to generate video images from multiple viewpoints simultaneously.
  • the image corresponding to the current viewpoint is intercepted in real time from the video frame corresponding to the current viewpoint for viewing; in the application scenario of on-demand viewing, the video frame corresponding to the current viewpoint at the current moment is obtained from the image frame sequence and the image corresponding to the current viewpoint is intercepted for viewing; with spatial stitching, this leads to the technical problem that the resolution of the single-channel video of each viewpoint displayed by the free viewpoint application decreases.
  • This application designs a video data processing method that allows the free viewpoint to be switched with zero delay while the main viewpoint still provides a higher picture resolution.
  • the image collected by each camera is the image corresponding to one viewpoint; one of the viewpoints is used as the main viewpoint, and the other viewpoints are used as secondary viewpoints.
  • the video frames transmitted over each transmission path are obtained by splicing the image of the main viewpoint corresponding to that transmission path with the images of the secondary viewpoints other than the main viewpoint and then encoding the result; that is, each video frame is obtained by splicing the images of the main viewpoint and the secondary viewpoints, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the image corresponding to a secondary viewpoint.
  • the current viewpoint is the main viewpoint.
  • when receiving the viewpoint generation and display instruction sent by the display device, the decoding device acquires the current viewpoint of the display device according to that instruction.
  • the decoding device parses the viewpoint generation and display instruction to obtain the current viewpoint of the display device; after obtaining the current viewpoint, it intercepts the image required to generate the image of the current viewpoint from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint and sends that image to the display device to generate the current viewpoint picture, so that the currently watched video picture is a high-resolution picture;
  • each viewpoint in the video frame is arranged according to a preset arrangement method, and the images corresponding to the viewpoints P1-P10 in Fig.
  • the arrangement information of each viewpoint in the video frame includes: the coordinates, in the video frame, of the pixel at the upper left corner of the corresponding viewpoint image, the width and height of the corresponding viewpoint image, the viewpoint number corresponding to the image, etc.;
  • after the arrangement of each viewpoint in the video frame is determined, the viewpoint identifier corresponding to the main viewpoint and the arrangement information of the video frame are acquired, and the position information of the main viewpoint's image in the video frame is determined according to the arrangement information and the viewpoint identifier;
  • the image required to generate the image of the current viewpoint, located at that position information, is intercepted from the video frame received over the transmission path corresponding to the current viewpoint;
  • the image required to generate the image of the current viewpoint can be a viewpoint picture corresponding to the current viewpoint, or a viewpoint depth map picture corresponding to a virtual viewpoint; the virtual viewpoint is a fictitious viewpoint located between the viewpoints of two cameras.
  • the target viewpoint corresponding to the viewpoint switching instruction is obtained.
  • the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint; the image required to generate the image of the target viewpoint may be the viewpoint picture corresponding to the target viewpoint, or may be a viewpoint depth map picture corresponding to the virtual viewpoint.
  • the image required to generate the image of the target viewpoint is located in the same video frame as the previously intercepted image required to generate the image of the current viewpoint, the target viewpoint is one of the secondary viewpoints, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint. Because the switching process involves a delay, the video frame from the transmission path corresponding to the target viewpoint cannot be obtained and displayed immediately, so the display stays at low resolution during that period. Therefore, this application first intercepts, from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint, the image required to generate the image of the target viewpoint and displays it at low resolution.
  • when the switching condition is satisfied, the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the target viewpoint and sent to the display device for the display device to generate the image of the target viewpoint; at this time, the displayed image of the target viewpoint changes from low resolution to high resolution.
  • the switching condition includes at least one of the following: the time stamp of the video frame corresponding to the currently displayed image is the same as the time stamp of the video frame in the transmission path corresponding to the target viewpoint; the time stamp of the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint reaches a preset time point, which can be set according to the actual situation. It should be emphasized that the transmission path corresponding to the target viewpoint is not the same path as the aforementioned transmission path corresponding to the current viewpoint, and at the same time, the sequence of image frames received
  • P3 represents the third viewpoint
  • P3_2 is the second time period of the No. 3 viewpoint
  • P3_3 is the third time period of the No. 3 viewpoint
  • P3_2 to P3_3 are normal ones
  • the viewpoint switch occurs during the playback of P2_1. For example, when the playback is halfway through, switch to the A viewpoint.
  • when the viewpoint generation and display instruction sent by the display device is received, the current viewpoint of the display device is acquired according to that instruction; the image required to generate the image of the current viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the current viewpoint picture;
  • when the viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is obtained, and the image required to generate the image of the target viewpoint is intercepted from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate a target viewpoint picture, wherein the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint.
  • Fig. 2 shows the second embodiment of the present application. Based on step S130 of the first embodiment, the second embodiment of the present application includes the following steps:
  • Step S131: acquiring the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame;
  • Step S132: determining, according to the arrangement information and the viewpoint identifier, the position information in the video frame of the image required for generating the image of the target viewpoint;
  • Step S133: intercepting the image required to generate the image of the target viewpoint, located at that position information, from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint.
  • the viewpoint identifier is a viewpoint number, that is, the number corresponding to each viewpoint;
  • the arrangement information of the video frame is generated from the arrangement of the viewpoints in the video frame under a preset arrangement method. Specifically, each viewpoint in the video frame is arranged according to a preset arrangement method, and the images corresponding to the viewpoints P1-P10 in FIG.
  • the arrangement information of each viewpoint in the video frame includes: the coordinates, in the video frame, of the pixel at the upper left corner of the corresponding viewpoint image, the width and height of the corresponding viewpoint image, the viewpoint number corresponding to the image, and other information; according to the arrangement information, the image corresponding to a viewpoint can be intercepted from the video frame and displayed.
  • for example, in this embodiment, the specific process of intercepting and displaying the image corresponding to the target viewpoint from the video frame received over the transmission path corresponding to the current viewpoint is: obtain the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame, determine the position information in the video frame of the image corresponding to the target viewpoint according to the arrangement information and the viewpoint identifier, and intercept the image corresponding to the target viewpoint at that position information from the video frame received over the transmission path corresponding to the current viewpoint for display.
  • this embodiment obtains the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame; determines, according to the arrangement information and the viewpoint identifier, the position information in the video frame of the image required for generating the image of the target viewpoint; and intercepts the image required to generate the image of the target viewpoint, located at that position information, from the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint.
  • these technical means realize the low-resolution display of the image corresponding to the target viewpoint.
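  • Steps S131 to S133 amount to a lookup followed by a rectangular crop. The sketch below shows this with a frame modelled as a list of pixel rows. `Region`, `locate`, and `crop` are hypothetical names, and the small 6-by-4 frame stands in for a real stitched frame purely for illustration.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    """Position information of one viewpoint's image inside the video frame."""
    x: int       # column of the top-left pixel
    y: int       # row of the top-left pixel
    width: int
    height: int


def locate(arrangement: Dict[int, Region], viewpoint_id: int) -> Region:
    # S131/S132: use the viewpoint identifier to look up the position
    # information in the arrangement information.
    return arrangement[viewpoint_id]


def crop(frame: List[List[int]], region: Region) -> List[List[int]]:
    # S133: intercept the sub-image at that position from the video frame.
    return [
        row[region.x : region.x + region.width]
        for row in frame[region.y : region.y + region.height]
    ]


# Toy frame: 4 rows by 6 columns, each pixel encoded as row * 10 + column.
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
# Assumed toy arrangement: viewpoint 1 is the large main image, viewpoints 2
# and 3 are smaller secondary images stacked on the right.
arrangement = {1: Region(0, 0, 4, 4), 2: Region(4, 0, 2, 2), 3: Region(4, 2, 2, 2)}
target_image = crop(frame, locate(arrangement, 3))
```

  • Because the crop reads from the frame already being decoded on the current viewpoint's path, no new stream has to be opened before the target viewpoint can be shown.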
  • Fig. 3 shows the third embodiment of the present application.
  • the third embodiment of the present application comprises the following steps:
  • Step S210: acquiring the images of the viewpoints captured by the cameras, different cameras capturing images corresponding to different viewpoints;
  • Step S220: splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and encoding the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence;
  • Step S230: when the decoding device receives the viewpoint generation and display instruction sent by the display device and has acquired the current viewpoint of the display device according to that instruction, transmitting the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
  • the application can deploy multiple cameras, and the number of cameras can be set according to the actual situation; the images of the viewpoints captured by the cameras are obtained, different cameras capturing images corresponding to different viewpoints; an image can be the viewpoint picture corresponding to a viewpoint, or the viewpoint depth map picture corresponding to the virtual viewpoint corresponding to a viewpoint.
  • each viewpoint is used as the main viewpoint to generate the first image, and the viewpoints other than the main viewpoint are used as secondary viewpoints corresponding to the main viewpoint to generate the second images of the secondary viewpoints.
  • this application can deploy 10 cameras to shoot video, and the cameras shoot around a shooting focus.
  • P1-P10 are the images taken by the respective cameras, that is, the images of viewpoints No. 1 to No. 10.
  • the video frames corresponding to the main viewpoint are encoded and then sent to the display terminal for display.
  • the image collected by each camera is the image corresponding to one viewpoint; one of the viewpoints is taken as the main viewpoint, and the other viewpoints are taken as the secondary viewpoints.
  • the video frames transmitted in each transmission path are obtained by encoding frames spliced from the image of the main viewpoint corresponding to that transmission path and the images of the secondary viewpoints other than the main viewpoint; that is, each video frame is obtained by splicing the images of the main viewpoint and the secondary viewpoints, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the images corresponding to the secondary viewpoints.
  • the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are spliced to obtain the video frame corresponding to the main viewpoint, and the video frames corresponding to the main viewpoint are fed, in shooting-time order, to a general-purpose HEVC encoder, which generates the corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image; for example, as shown in Figure 7, when P2 is the main viewpoint, the resolution corresponding to P2 is 2880*1620 and the resolution corresponding to each secondary viewpoint is 960*540, that is, when P2 is used as the main viewpoint, the resolution of the P2 viewpoint is greater than the resolution of the secondary viewpoints;
  • the encoder needs to use each of the 10 images as the main viewpoint in turn to generate the video frames of 10 transmission paths; the video frame of each transmission path is composed of the image corresponding to the main viewpoint and the images of the other 9 secondary viewpoints.
  • Figure 7 is a video frame with the P2 viewpoint as the main viewpoint: P1 and P3-P10 are used as the secondary viewpoints, and the images of all viewpoints are spliced to obtain the video frame corresponding to the P2 main viewpoint; the splicing method when another viewpoint is used as the main viewpoint is the same as for the P2 main viewpoint and is not repeated here.
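To make the splicing arithmetic concrete, the following is a minimal Python sketch. It does not reproduce the layout of Figure 7: it assumes a hypothetical arrangement in which the 2880*1620 main image sits at the top of the frame and the nine 960*540 secondary images fill a 3x3 grid below it, giving a 2880*3240 spliced frame. The names `Tile`, `build_layout` and `layouts` are illustrative and do not come from the application.

```python
from dataclasses import dataclass

MAIN_W, MAIN_H = 2880, 1620   # main-viewpoint resolution from the example
SUB_W, SUB_H = 960, 540       # secondary-viewpoint resolution from the example
NUM_VIEWS = 10                # viewpoints P1..P10


@dataclass(frozen=True)
class Tile:
    view_id: int  # viewpoint number, 1..10
    x: int        # left edge in the spliced frame, in pixels
    y: int        # top edge in the spliced frame, in pixels
    w: int        # tile width in pixels
    h: int        # tile height in pixels


def build_layout(main_id: int) -> list[Tile]:
    """Return the tiles of the spliced frame whose main viewpoint is main_id.

    Assumed layout (not taken from the application): main image on top,
    nine secondary images in a 3x3 grid underneath.
    """
    if not 1 <= main_id <= NUM_VIEWS:
        raise ValueError(f"main_id must be in 1..{NUM_VIEWS}")
    tiles = [Tile(main_id, 0, 0, MAIN_W, MAIN_H)]
    secondary = [v for v in range(1, NUM_VIEWS + 1) if v != main_id]
    cols = MAIN_W // SUB_W  # three secondary tiles fit in one row
    for i, view_id in enumerate(secondary):
        row, col = divmod(i, cols)
        tiles.append(Tile(view_id, col * SUB_W, MAIN_H + row * SUB_H, SUB_W, SUB_H))
    return tiles


# One layout per transmission path: every viewpoint takes a turn as main.
layouts = {main_id: build_layout(main_id) for main_id in range(1, NUM_VIEWS + 1)}
```

Here `layouts[2]` describes the frame whose main viewpoint is P2, and the other nine entries describe the frames of the remaining transmission paths.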
  • the image collected by each camera is the image corresponding to one viewpoint; one of the viewpoints is taken as the main viewpoint and the other viewpoints are taken as the secondary viewpoints; the video frames transmitted in each transmission path are obtained by encoding frames spliced from the image of the corresponding main viewpoint and the images of the secondary viewpoints other than the main viewpoint, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the images corresponding to the secondary viewpoints, that is, the resolution of the first image is greater than the resolution of the second image; after the encoding is completed, when the decoding device receives the viewpoint generation and display instruction sent by the display device and the current viewpoint of the display device has been acquired according to that instruction, the image frame sequence corresponding to the current viewpoint is transmitted to the decoding device through the transmission path corresponding to the current viewpoint; for example, after the main viewpoint of the display device is obtained according to the viewpoint generation and display instruction, …
  • this embodiment adopts the following technical means: each viewpoint is used as the main viewpoint to generate the first image, and the viewpoints other than the main viewpoint are used as the secondary viewpoints corresponding to that main viewpoint to generate the second images of the secondary viewpoints, the image including at least one of a viewpoint picture or a viewpoint depth map picture; the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are spliced to obtain the video frame corresponding to the main viewpoint, and the spliced video frames corresponding to the main viewpoint are encoded according to the shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image; when the decoding device receives the viewpoint generation and display instruction sent by the display device and the current viewpoint of the display device has been acquired according to that instruction, the image frame sequence corresponding to the current viewpoint is transmitted to the decoding device
  • Figure 4 shows the fourth embodiment of the present application; based on step S220 of the third embodiment, the fourth embodiment includes the following steps:
  • Step S221: splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint;
  • Step S222: sorting the video frames corresponding to the main viewpoint according to the shooting time to generate a spliced image sequence;
  • Step S223: encoding the spliced image sequence to obtain an image frame sequence corresponding to each of the main viewpoints.
  • the video frames being watched are actually historical video frames; the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are spliced to obtain the video frames corresponding to the main viewpoint, and these video frames are sorted according to the shooting time to generate a spliced image sequence corresponding to each main viewpoint; the spliced image sequence is then encoded to obtain an image frame sequence corresponding to each main viewpoint, wherein the first frame image in the image frame sequence corresponding to each main viewpoint is encoded as an I frame.
  • viewpoint switching can only be performed at an I frame.
  • the start frame is the I frame, also known as the key frame.
  • the I frame is an intra-coded picture.
  • the I frame is usually the first frame of each image frame sequence; it is moderately compressed, serves as a reference point for random access, and can be regarded as a complete image.
  • the encoded code stream can be sliced at 1-second intervals, with a start frame inserted every 1 second.
  • the application encodes the first frame image in the image frame sequence corresponding to each main viewpoint as an I frame, so that when a viewpoint switching instruction is received, the video frames of the main viewpoint corresponding to that instruction can be obtained starting from the start frame.
  • this embodiment adopts the following technical means: the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are spliced to obtain the video frame corresponding to the main viewpoint; the video frames corresponding to the main viewpoint are sorted according to the shooting time to generate a spliced image sequence; and the spliced image sequence is encoded to obtain the image frame sequence corresponding to each main viewpoint, wherein the first frame image in the image frame sequence corresponding to each main viewpoint is encoded as an I frame; an image frame sequence corresponding to the main viewpoint is thereby generated.
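The ordering and start-frame rule of the fourth embodiment can be sketched as follows. This is a simplified Python illustration, assuming shooting times in milliseconds and a 1-second segment length; `SplicedFrame` and `plan_frame_types` are hypothetical names, and the "I"/"P" labels only stand in for decisions that a real HEVC encoder would make.

```python
from dataclasses import dataclass


@dataclass
class SplicedFrame:
    shot_ms: int            # shooting time in milliseconds (assumed unit)
    frame_type: str = "P"   # placeholder label, overwritten below


def plan_frame_types(frames, segment_ms=1000):
    """Sort frames by shooting time and mark the first frame of each segment "I".

    Every other frame is labelled "P". The frames are updated in place and
    the sorted list is returned.
    """
    ordered = sorted(frames, key=lambda f: f.shot_ms)
    if not ordered:
        return ordered
    start = ordered[0].shot_ms
    last_segment = None
    for frame in ordered:
        segment = (frame.shot_ms - start) // segment_ms
        # A new segment begins whenever the segment index changes.
        frame.frame_type = "I" if segment != last_segment else "P"
        last_segment = segment
    return ordered
```

Because each 1-second segment starts with an I frame, a decoder that switches to another transmission path can begin decoding at the next segment boundary without needing earlier frames of that path.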
  • Figure 5 shows the fifth embodiment of the present application; based on step S223 of the fourth embodiment, the fifth embodiment includes the following steps:
  • Step S2231: acquiring the arrangement information of the video frame corresponding to the main viewpoint, the arrangement information at least including the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the video frame corresponding to the main viewpoint;
  • Step S2232: encoding the spliced image sequence, and inserting the arrangement information into the sequence header of the encoded spliced image sequence to obtain the image frame sequence corresponding to the main viewpoint.
  • the viewpoint identifier is a viewpoint number, that is, the number corresponding to each viewpoint; the arrangement information of the video frame corresponding to the main viewpoint is generated by arranging the viewpoints in the video frame according to a preset arrangement method, which can be set according to the actual situation; specifically, the first image corresponding to each main viewpoint and the second images of all the secondary viewpoints corresponding to that main viewpoint are spliced to obtain the video frames corresponding to the main viewpoint, the video frames are sorted according to the shooting time to generate a spliced image sequence, and the spliced image sequence is encoded; at the same time, the arrangement information of the video frames corresponding to the main viewpoint is acquired and inserted into the sequence header of the encoded spliced image sequence, so as to obtain the image frame sequence corresponding to the main viewpoint.
  • each viewpoint in the video frame corresponding to the main viewpoint is arranged according to a preset arrangement method, and the images corresponding to the P1-P10 viewpoints in FIG.
  • the position information in the video frame corresponding to the main viewpoint; the arrangement information is written, as user extension information, into the image header of the video frame corresponding to the main viewpoint.
  • this embodiment adopts the following technical means: the arrangement information of the video frames corresponding to the main viewpoint is acquired, the arrangement information at least including the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the video frame corresponding to the main viewpoint; the spliced image sequence is encoded, and the arrangement information is inserted into the sequence header of the encoded spliced image sequence to obtain the image frame sequence corresponding to the main viewpoint; an image frame sequence corresponding to the main viewpoint is thereby generated.
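The application does not specify how the arrangement information is laid out inside the sequence header, so the following Python sketch assumes a hypothetical byte format: a one-byte entry count followed, for each viewpoint, by a one-byte viewpoint number and four 16-bit big-endian fields (x, y, width, height). The functions `pack_arrangement` and `unpack_arrangement` are illustrative names; how the payload is actually attached to an HEVC stream is outside the scope of the sketch.

```python
import struct

# Hypothetical per-viewpoint record: view_id, x, y, width, height.
_ENTRY = struct.Struct(">BHHHH")


def pack_arrangement(entries):
    """Serialize a list of (view_id, x, y, w, h) tuples into bytes."""
    payload = struct.pack(">B", len(entries))
    for entry in entries:
        payload += _ENTRY.pack(*entry)
    return payload


def unpack_arrangement(payload):
    """Inverse of pack_arrangement; returns {view_id: (x, y, w, h)}."""
    (count,) = struct.unpack_from(">B", payload, 0)
    result = {}
    offset = 1
    for _ in range(count):
        view_id, x, y, w, h = _ENTRY.unpack_from(payload, offset)
        result[view_id] = (x, y, w, h)
        offset += _ENTRY.size
    return result
```

A decoder that parses this payload once per image frame sequence can then look up the rectangle of any viewpoint by its viewpoint number.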
  • the embodiment of the present application also provides a decoding device; as shown in Figure 8, the decoding device includes a first receiving module 10, a first sending module 20, a second receiving module 30 and a second sending module 40;
  • the first receiving module 10 is configured to, when receiving the viewpoint generation and display instruction sent by the display device, acquire the current viewpoint of the display device according to the viewpoint generation and display instruction;
  • the first sending module 20 is configured to intercept the image of the current viewpoint from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, and send the image of the current viewpoint to the display device to generate the current viewpoint picture;
  • the second receiving module 30 is configured to, when receiving the viewpoint switching instruction sent by the display device, acquire the target viewpoint corresponding to the viewpoint switching instruction, intercept the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, and send the image required to generate the image of the target viewpoint to the display device to generate a target viewpoint picture, wherein the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint;
  • the second sending module 40 is configured to, when the switching condition is met, intercept the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received through the transmission path corresponding to the target viewpoint, and send the image required for generating the image of the target viewpoint to the display device to generate the current viewpoint picture.
  • the video stream is decoded to obtain corresponding viewpoint pictures.
  • the embodiment of the present application also provides a coding device. As shown in FIG.
  • the image acquisition module 50 is configured to acquire the images of the viewpoints captured by the cameras, different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is used as the main viewpoint to generate the first image, and the viewpoints other than the main viewpoint are used as the secondary viewpoints corresponding to that main viewpoint to generate the second images of the secondary viewpoints, the image including at least one of a viewpoint picture or a viewpoint depth map picture;
  • the splicing and encoding module 60 is configured to splice the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and to encode the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
  • the data transmission module 70 is configured to, when the decoding device receives the viewpoint generation and display instruction sent by the display device and the current viewpoint of the display device has been acquired according to that instruction, transmit the image frame sequence corresponding to the current viewpoint to the decoding device through the transmission path corresponding to the current viewpoint.
  • the viewpoint pictures are encoded to obtain corresponding video streams.
  • an embodiment of the present application also provides a storage medium; the storage medium stores a video data processing program, and when the video data processing program is executed by a processor, the steps of each of the above video data processing methods are implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • the storage medium provided by the embodiment of the present application is the storage medium used to implement the method of the embodiment of the present application; based on the method introduced in the embodiment of the present application, those skilled in the art can understand the specific structure and variations of the storage medium, so they are not repeated here. All storage media used in the methods of the embodiments of the present application fall within the intended protection scope of the present application.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the function specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
  • the application can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware.
  • the use of the words first, second, and third, etc. does not indicate any order. These words can be interpreted as names.

Abstract

Disclosed in the present application are a video data processing method, a decoding device, an encoding device, and a storage medium. The method comprises: when a generation and display instruction is received, obtaining a current viewpoint; intercepting, from a video frame received by a transmission path corresponding to the current viewpoint, an image required for generating an image of the current viewpoint; when a viewpoint switching instruction is received, obtaining a target viewpoint, and intercepting, from the video frame received by the transmission path corresponding to the current viewpoint, an image required for generating an image of the target viewpoint; and when a switching condition is satisfied, intercepting, from a video frame received by a transmission path corresponding to the target viewpoint, the image required for generating the image of the target viewpoint and displaying same.

Description

Video data processing method, decoding device, encoding device and storage medium
Related application
This application claims priority to the Chinese patent application No. 202111040999.9, filed on September 2, 2021 and entitled "Video Data Processing Method, Decoding Device, Encoding Device, and Storage Medium", which is hereby incorporated by reference in its entirety.
Technical field
The present application relates to the technical field of video data processing, and in particular to a video data processing method, a decoding device, an encoding device and a storage medium.
Background
Free viewpoint technology enables video to be watched from a freely chosen viewing angle. Current free-viewpoint applications allow viewers to watch video as a continuum of viewpoints within a certain range. The viewer can set the position and angle of the viewpoint and is no longer limited to video shot from one fixed camera angle, so that video can be watched from a 360° free viewing angle.
Current free-viewpoint applications often use a spatial-domain splicing method to splice the single-channel videos of multiple viewpoints together. When the user switches viewpoints in the free-viewpoint application, the application uses the spliced single-channel videos of the multiple viewpoints to display the single-channel video corresponding to the viewpoint switched to. However, splicing the single-channel videos of multiple viewpoints with the spatial-domain splicing method lowers the resolution of each viewpoint's single-channel video, so the picture resolution needed for display by the free-viewpoint application is insufficient and the resolution of the finally generated viewpoint picture is low.
Technical problem
By providing a video data processing method, a decoding device, an encoding device and a storage medium, the embodiments of the present application aim to solve the technical problem that splicing the single-channel videos of multiple viewpoints with the spatial-domain splicing method leaves the picture resolution needed for display by a free-viewpoint application insufficient, which in turn lowers the resolution of the finally generated viewpoint picture.
Technical solution
An embodiment of the present application provides a video data processing method applied to a decoding device, the video data processing method comprising:
when a viewpoint generation and display instruction sent by a display device is received, acquiring the current viewpoint of the display device according to the viewpoint generation and display instruction;
intercepting, from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, the image required for generating the image of the current viewpoint, and sending the image required for generating the image of the current viewpoint to the display device to generate a current viewpoint picture;
when a viewpoint switching instruction sent by the display device is received, acquiring the target viewpoint corresponding to the viewpoint switching instruction, intercepting, from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, the image required for generating the image of the target viewpoint, and sending the image required for generating the image of the target viewpoint to the display device to generate a target viewpoint picture;
when a switching condition is satisfied, intercepting, from the video frames of the image frame sequence received through the transmission path corresponding to the target viewpoint, the image required for generating the image of the target viewpoint, and sending the image required for generating the image of the target viewpoint to the display device to generate the image of the target viewpoint.
In an embodiment, the step of intercepting, from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, the image required for generating the image of the target viewpoint comprises:
acquiring the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame;
determining, according to the arrangement information and the viewpoint identifier, the position information in the video frame of the image required for generating the image of the target viewpoint;
intercepting, from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, the image required for generating the image of the target viewpoint at the position indicated by the position information.
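These three steps amount to a lookup followed by a crop. A minimal Python sketch, assuming the arrangement information has already been parsed into a dictionary that maps each viewpoint identifier to a hypothetical `(x, y, width, height)` rectangle, and that a decoded video frame is represented as a list of pixel rows; `crop_viewpoint` is an illustrative name, not one used by the application:

```python
def crop_viewpoint(frame, arrangement, view_id):
    """Return the sub-image of view_id from a decoded spliced frame.

    frame: list of pixel rows (each row a list), standing in for a decoded picture.
    arrangement: {view_id: (x, y, w, h)}, the parsed arrangement information.
    """
    if view_id not in arrangement:
        raise KeyError(f"viewpoint {view_id} is not present in this frame")
    x, y, w, h = arrangement[view_id]
    frame_height = len(frame)
    frame_width = len(frame[0]) if frame else 0
    if y + h > frame_height or x + w > frame_width:
        raise ValueError("arrangement entry lies outside the frame")
    # Keep rows y..y+h-1 and, within each row, columns x..x+w-1.
    return [row[x:x + w] for row in frame[y:y + h]]
```

The same function serves both cases in the method: with the current viewpoint's identifier it returns the high-resolution main image, and with the target viewpoint's identifier it returns that viewpoint's lower-resolution image from the same frame.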
In an embodiment, the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint.
In an embodiment, the switching condition includes at least one of the following:
the timestamp of the video frame corresponding to the currently displayed image is the same as the timestamp of a video frame in the transmission path corresponding to the target viewpoint;
the timestamp of the video frame of the image frame sequence received from the transmission path corresponding to the current viewpoint reaches a preset time point.
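The two switching conditions can be expressed as a single predicate. The Python sketch below is only an illustration under stated assumptions: timestamps are plain integers, "reaches a preset time point" is read as "greater than or equal to", and `should_switch` with its parameter names is hypothetical.

```python
def should_switch(displayed_ts, current_path_ts, target_path_ts, preset_ts=None):
    """Return True when at least one of the two switching conditions holds.

    displayed_ts: timestamp of the video frame behind the currently displayed image.
    current_path_ts: timestamp of the latest frame received on the current path.
    target_path_ts: timestamps of the frames available on the target path.
    preset_ts: optional preset time point; None disables the second condition.
    """
    # Condition 1: the target path holds a frame with the displayed timestamp.
    same_timestamp = displayed_ts in set(target_path_ts)
    # Condition 2: the current path has reached the preset time point.
    reached_preset = preset_ts is not None and current_path_ts >= preset_ts
    return same_timestamp or reached_preset
```

Until the predicate returns True, the decoding device keeps intercepting the target viewpoint's image from the current path's frames; once it returns True, it starts intercepting from the frames of the target viewpoint's own transmission path.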
An embodiment of the present application provides a video data processing method applied to an encoding device, the video data processing method comprising:
acquiring the images of the viewpoints captured by the cameras, different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is used as a main viewpoint to generate a first image, and the viewpoints other than the main viewpoint are used as the secondary viewpoints corresponding to the main viewpoint to generate second images of the secondary viewpoints, the image including at least one of a viewpoint picture or a viewpoint depth map picture;
splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to the main viewpoint to obtain a video frame corresponding to the main viewpoint, and encoding the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
when the decoding device receives the viewpoint generation and display instruction sent by the display device and the current viewpoint of the display device has been acquired according to that instruction, transmitting the image frame sequence corresponding to the current viewpoint to the decoding device through the transmission path corresponding to the current viewpoint.
In an embodiment, the step of splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to the main viewpoint to obtain a video frame corresponding to the main viewpoint, and encoding the spliced video frames corresponding to the main viewpoint according to the shooting time to generate a corresponding image frame sequence comprises:
splicing the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to the main viewpoint to obtain a video frame corresponding to the main viewpoint;
sorting the video frames corresponding to the main viewpoint according to the shooting time to generate a spliced image sequence;
encoding the spliced image sequence to obtain an image frame sequence corresponding to each main viewpoint, wherein the first frame image in the image frame sequence corresponding to each main viewpoint is encoded as an I frame.
In an embodiment, the step of encoding the spliced image sequence to obtain an image frame sequence corresponding to each main viewpoint comprises:
acquiring the arrangement information of the video frame corresponding to the main viewpoint, the arrangement information at least including the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the video frame corresponding to the main viewpoint;
encoding the spliced image sequence, and inserting the arrangement information into the sequence header of the encoded spliced image sequence to obtain the image frame sequence corresponding to the main viewpoint.
In addition, to achieve the above object, the present application further provides a decoding device, the decoding device comprising:
a first receiving module, configured to, when a viewpoint generation and display instruction sent by a display device is received, acquire the current viewpoint of the display device according to the viewpoint generation and display instruction;
a first sending module, configured to intercept the image of the current viewpoint from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, and send the image of the current viewpoint to the display device to generate a current viewpoint picture;
a second receiving module, configured to, when a viewpoint switching instruction sent by the display device is received, acquire the target viewpoint corresponding to the viewpoint switching instruction, intercept, from the video frames of the image frame sequence received through the transmission path corresponding to the current viewpoint, the image required for generating the image of the target viewpoint, and send the image required for generating the image of the target viewpoint to the display device to generate a target viewpoint picture, wherein the image required for generating the image of the current viewpoint and the image required for generating the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint;
a second sending module, configured to, when a switching condition is satisfied, intercept, from the video frames of the image frame sequence received through the transmission path corresponding to the target viewpoint, the image required for generating the image of the target viewpoint, and send the image required for generating the image of the target viewpoint to the display device to generate the current viewpoint picture.
In addition, to achieve the above object, the present application further provides an encoding device, the encoding device comprising:
an image acquisition module, configured to acquire images of the respective viewpoints captured by the respective cameras, different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is taken as a main viewpoint to generate a first image, and the viewpoints other than the main viewpoint are taken as secondary viewpoints corresponding to that main viewpoint to generate second images of the secondary viewpoints, the images including at least one of a viewpoint picture or a viewpoint depth map picture;
a stitching and encoding module, configured to stitch the first image corresponding to each main viewpoint with the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and to encode the stitched video frames corresponding to the main viewpoint according to shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
a data transmission module, configured to, after the decoding device receives a viewpoint generation display instruction sent by the display device and acquires the current viewpoint of the display device according to that instruction, transmit the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
In addition, to achieve the above object, the present application further provides a smart device, the smart device comprising: a memory, a processor, and a video data processing program stored in the memory and executable on the processor, wherein the video data processing program, when executed by the processor, implements the steps of the video data processing method described above.
In addition, to achieve the above object, the present application further provides a storage medium storing a video data processing program, wherein the video data processing program, when executed by a processor, implements the steps of the video data processing method described above.
Beneficial effects
In the technical solutions of the video data processing method, decoding device, encoding device, and storage medium provided in the embodiments of the present application, when a viewpoint generation display instruction sent by the display device is received, the current viewpoint of the display device is acquired according to that instruction; the image required to generate the image of the current viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the current viewpoint picture; when a viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is acquired, and the image required to generate the image of the target viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate a target viewpoint picture; and when the switching condition is met, the image required to generate the image of the target viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint and sent to the display device to generate the image of the target viewpoint. This solves the technical problem that, after the single-channel videos of multiple viewpoints are stitched using a spatial-domain stitching method, the picture resolution available to the free-viewpoint application is insufficient, which in turn lowers the resolution of the finally generated viewpoint picture, and it improves the display quality of the viewpoint picture.
Description of the drawings
FIG. 1 is a schematic flowchart of the first embodiment of the video data processing method of the present application;
FIG. 2 is a schematic flowchart of the second embodiment of the video data processing method of the present application;
FIG. 3 is a schematic flowchart of the third embodiment of the video data processing method of the present application;
FIG. 4 is a schematic flowchart of the fourth embodiment of the video data processing method of the present application;
FIG. 5 is a schematic flowchart of the fifth embodiment of the video data processing method of the present application;
FIG. 6 is a schematic diagram of video frame switching in the present application;
FIG. 7 is a schematic diagram of the video frame arrangement of the present application;
FIG. 8 is a schematic flow diagram of multi-viewpoint video data in the decoding device of the present application;
FIG. 9 is a schematic flow diagram of multi-viewpoint video data in the encoding device of the present application.
The realization of the objects, the functional features, and the advantages of the present application will be further described with reference to the embodiments and the accompanying drawings. The drawings show only one embodiment and do not represent the whole of the present application.
Embodiments of the present invention
For a better understanding of the above technical solutions, exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
The embodiments of the present application provide embodiments of a video data processing method. It should be noted that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one shown here.
As shown in FIG. 1, in the first embodiment of the present application, the video data processing method of the present application includes the following steps:
Step S110: upon receiving a viewpoint generation display instruction sent by the display device, acquire the current viewpoint of the display device according to the viewpoint generation display instruction;
Step S120: crop, from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, the image required to generate the image of the current viewpoint, and send that image to the display device to generate the current viewpoint picture;
Step S130: upon receiving a viewpoint switching instruction sent by the display device, acquire the target viewpoint corresponding to the viewpoint switching instruction, crop, from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, the image required to generate the image of the target viewpoint, and send that image to the display device to generate a target viewpoint picture;
Step S140: when the switching condition is met, crop, from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint, the image required to generate the image of the target viewpoint, and send that image to the display device to generate the image of the target viewpoint.
In this embodiment, a free-viewpoint application allows the viewer to watch video from continuously variable viewpoints within a certain range. The viewer can set the position and angle of the viewpoint and is no longer limited to a fixed camera angle. Such an application usually requires multiple cameras shooting simultaneously and generating video pictures of multiple viewpoints at the same time. In a live viewing scenario, the image corresponding to the current viewpoint is cropped in real time from the video frame corresponding to the current viewpoint for viewing; in an on-demand viewing scenario, the video frame corresponding to the current viewpoint at the current moment is obtained from the image frame sequence, and the image corresponding to the current viewpoint is cropped from it for viewing. To solve the technical problem in the prior art that stitching the single-channel videos of multiple viewpoints with a spatial-domain stitching method lowers the resolution of the single-channel video of each viewpoint displayed by the free-viewpoint application, the present application provides a video data processing method that guarantees zero-delay switching between free viewpoints while providing a higher picture resolution for the main viewpoint, which is watched for long periods.
In this embodiment, multiple cameras may be deployed, and the images captured by different cameras are stitched together. The image captured by each camera is the image corresponding to one viewpoint; one of the viewpoints is taken as the main viewpoint and the other viewpoints as secondary viewpoints. The video frames transmitted on each transmission path are obtained by encoding the image produced by stitching the main viewpoint corresponding to that transmission path with the secondary viewpoints other than the main viewpoint. In other words, each video frame is obtained mainly by stitching the images of the main viewpoint and the secondary viewpoints, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the images corresponding to the secondary viewpoints.
In this embodiment, the current viewpoint is a main viewpoint. Upon receiving the viewpoint generation display instruction sent by the display device, the decoding device acquires the current viewpoint of the display device according to that instruction; specifically, the decoding device parses the viewpoint generation display instruction to obtain the current viewpoint of the display device to which the instruction corresponds. After the current viewpoint of the display device is acquired, the image required to generate the image of the current viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the current viewpoint picture, so that the video picture currently being watched is a high-resolution picture.
In this embodiment, specifically, before the image required to generate the image of the current viewpoint is cropped, the arrangement of the viewpoints in the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint must be determined. The viewpoints in a video frame are arranged according to a preset arrangement, with which the images corresponding to viewpoints P1-P10 in FIG. 7 can be stitched into the same video frame, and arrangement information is generated for each viewpoint in the video frame. The arrangement information of each viewpoint includes the coordinates, in the video frame, of the top-left pixel of the corresponding viewpoint image, the width and height of that viewpoint image, the viewpoint number corresponding to the image, and similar information. After the arrangement of the viewpoints in the video frame is determined, the viewpoint identifier corresponding to the main viewpoint and the arrangement information of the video frame are acquired, the position information of the image corresponding to the main viewpoint in the video frame is determined from the arrangement information and the viewpoint identifier, and the image required to generate the image of the current viewpoint is cropped, at the position indicated by the position information, from the video frame received over the transmission path corresponding to the current viewpoint. The image required for the image of the current viewpoint may be the viewpoint picture corresponding to the current viewpoint or the viewpoint depth map picture corresponding to a virtual viewpoint; a virtual viewpoint is a synthetic viewpoint located between the viewpoints of two cameras.
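The lookup-and-crop procedure described above can be sketched as follows. This is a minimal illustration only: the `ViewRegion` record, the `crop_viewpoint` helper, and the representation of a frame as a list of pixel rows are assumptions made for the example and are not defined in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewRegion:
    """Hypothetical arrangement entry for one viewpoint in a stitched frame."""
    viewpoint_id: int  # viewpoint number
    x: int             # column of the top-left pixel in the video frame
    y: int             # row of the top-left pixel in the video frame
    width: int
    height: int


def crop_viewpoint(frame, layout, viewpoint_id):
    """Return the sub-image of `frame` that belongs to `viewpoint_id`.

    `frame` is a list of pixel rows; `layout` is a list of ViewRegion entries.
    """
    region = next((r for r in layout if r.viewpoint_id == viewpoint_id), None)
    if region is None:
        raise KeyError(f"viewpoint {viewpoint_id} is not in this frame")
    rows = frame[region.y:region.y + region.height]
    return [row[region.x:region.x + region.width] for row in rows]


# Toy 4x6 frame: a 4x4 main view (viewpoint 2) on the left and two 2x2
# secondary views (viewpoints 1 and 3) stacked on the right.
frame = [[(r, c) for c in range(6)] for r in range(4)]
layout = [
    ViewRegion(2, 0, 0, 4, 4),
    ViewRegion(1, 4, 0, 2, 2),
    ViewRegion(3, 4, 2, 2, 2),
]
tile = crop_viewpoint(frame, layout, 3)
```

The same helper serves both the main viewpoint and any secondary viewpoint, since only the viewpoint number used for the lookup changes.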
In this embodiment, after the viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is acquired. At this point the switch to the target viewpoint cannot take effect immediately, and the video frames of the transmission path corresponding to the target viewpoint are not yet available. The image required to generate the image of the target viewpoint is therefore cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint. The image required to generate the image of the target viewpoint may be the viewpoint picture corresponding to the target viewpoint or the viewpoint depth map picture corresponding to a virtual viewpoint.
It should be emphasized that the image required for the image of the target viewpoint lies in the same video frame as the image required for the image of the current viewpoint, both being cropped from the video frame received over the transmission path corresponding to the current viewpoint. The target viewpoint is one of the secondary viewpoints, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint. This is because the switching process involves a delay: at the moment of switching, the video frames of the transmission path of the target viewpoint cannot be obtained immediately, so only a low-resolution picture is available for display during this period. The present application therefore first crops the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and displays it at low resolution.
In this embodiment, when the switching condition is met, the image required to generate the image of the target viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint and sent to the display device so that the display device can generate the image of the target viewpoint. At this point the image of the target viewpoint changes from low-resolution to high-resolution display. The switching condition includes at least one of the following: the timestamp of the video frame corresponding to the currently displayed image is the same as the timestamp of a video frame in the transmission path corresponding to the target viewpoint; or the timestamp of the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint reaches a preset time point, which can be set according to the actual situation. It should be emphasized that the transmission path corresponding to the target viewpoint is not the same path as the transmission path corresponding to the current viewpoint, and the image frame sequence received over the transmission path corresponding to the target viewpoint is not the same image frame sequence as the one received over the transmission path corresponding to the current viewpoint.
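The two switching conditions can be expressed as a small predicate. The sketch below is illustrative: the function name, the integer timestamps, and the use of `None` for "not available" are assumptions, and the application does not prescribe how timestamps are represented or compared.

```python
from typing import Optional


def switching_condition_met(displayed_ts: int,
                            target_path_ts: Optional[int],
                            preset_ts: Optional[int] = None) -> bool:
    """Return True when at least one of the two switching conditions holds.

    displayed_ts   : timestamp of the frame (from the current path) on display
    target_path_ts : timestamp of the frame available on the target path,
                     or None if no frame has arrived yet (assumption)
    preset_ts      : optional preset time point (assumption: same time base)
    """
    same_timestamp = target_path_ts is not None and displayed_ts == target_path_ts
    reached_preset = preset_ts is not None and displayed_ts >= preset_ts
    return same_timestamp or reached_preset
```

Until the predicate returns true, the decoder keeps cropping the low-resolution target image from the current path.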
In this embodiment, taking FIG. 6 as an example, P3 denotes viewpoint No. 3, P3_2 is the second time period of viewpoint No. 3, and P3_3 is the third time period of viewpoint No. 3. Going from P3_2 to P3_3 is ordinary playback, in which the content of the next time period is played once the current time period has finished, and no viewpoint switch is involved. The viewpoint switch occurs while P2_1 is being played: for example, halfway through P2_1 the viewer switches to viewpoint A. No high-resolution picture of viewpoint A exists within the time period corresponding to P2_1, so for the remainder of that time period only the low-resolution picture of viewpoint A can be obtained from the merge stream. When playback reaches the time period corresponding to P3_2, the high-resolution picture of viewpoint A can be obtained from P3_2, so the high-resolution picture of viewpoint A is displayed from P3_2 onward.
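The timeline in this example can be simulated with a few lines of code. The sketch assumes fixed-length time periods measured in frames and assumes that a pending switch completes at the next period boundary; `playback_plan` and its parameters are hypothetical names used only for illustration.

```python
def playback_plan(num_frames, segment_len, start_view, switch_frame, target_view):
    """Return a list of (frame_index, stream_main_view, shown_view, quality).

    stream_main_view is the main viewpoint of the stream being received;
    shown_view is the viewpoint actually displayed to the viewer.
    """
    stream_view = start_view
    shown_view = start_view
    plan = []
    for i in range(num_frames):
        # A pending switch completes at the next time-period boundary,
        # when frames from the target path become usable (assumption).
        if shown_view != stream_view and i % segment_len == 0:
            stream_view = shown_view
        # The switch request changes the displayed viewpoint immediately.
        if i == switch_frame:
            shown_view = target_view
        quality = "high" if shown_view == stream_view else "low"
        plan.append((i, stream_view, shown_view, quality))
    return plan


# Viewpoint 2 is playing; the viewer switches to viewpoint 3 at frame 2
# of a 4-frame time period.
plan = playback_plan(num_frames=8, segment_len=4, start_view=2,
                     switch_frame=2, target_view=3)
```

In this run the displayed viewpoint changes at the frame of the request, at low resolution, and returns to high resolution at the first frame of the next time period.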
According to the above technical solution of this embodiment, when a viewpoint generation display instruction sent by the display device is received, the current viewpoint of the display device is acquired according to that instruction; the image required to generate the image of the current viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate the current viewpoint picture; when a viewpoint switching instruction sent by the display device is received, the target viewpoint corresponding to the viewpoint switching instruction is acquired, and the image required to generate the image of the target viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint and sent to the display device to generate a target viewpoint picture, wherein the image required to generate the image of the current viewpoint and the image required to generate the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint; and when the switching condition is met, the image required to generate the image of the target viewpoint is cropped from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint and sent to the display device to generate the image of the target viewpoint. This solves the technical problem that, after the single-channel videos of multiple viewpoints are stitched using a spatial-domain stitching method, the picture resolution available to the free-viewpoint application is insufficient, which in turn lowers the resolution of the finally generated viewpoint picture, and it improves the display quality of the viewpoint picture.
FIG. 2 shows the second embodiment of the present application. Based on step S130 of the first embodiment, the second embodiment of the present application includes the following steps:
Step S131: acquire the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame;
Step S132: determine, according to the arrangement information and the viewpoint identifier, the position information in the video frame of the image required to generate the image of the target viewpoint;
Step S133: crop, at the position indicated by the position information, the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint.
In this embodiment, the viewpoint identifier is a viewpoint number, that is, the number assigned to each viewpoint. The arrangement information of the video frame is generated from the arrangement of the viewpoints in the video frame according to a preset arrangement. Specifically, the viewpoints in a video frame are arranged according to the preset arrangement, with which the images corresponding to viewpoints P1-P10 in FIG. 7 can be stitched into the same video frame, and arrangement information is generated for each viewpoint in the video frame. The arrangement information of each viewpoint includes the coordinates, in the video frame, of the top-left pixel of the corresponding viewpoint image, the width and height of that viewpoint image, the viewpoint number corresponding to the image, and similar information. Using the arrangement information, the image corresponding to a viewpoint can be cropped from the video frame for display. For example, in this embodiment the image corresponding to the target viewpoint is cropped from the video frame received over the transmission path corresponding to the current viewpoint and displayed as follows: the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame are acquired; the position information of the image corresponding to the target viewpoint in the video frame is determined from the arrangement information and the viewpoint identifier; and the image corresponding to the target viewpoint is cropped, at the position indicated by the position information, from the video frame received over the transmission path corresponding to the current viewpoint and displayed.
According to the above technical solution of this embodiment, the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame are acquired; the position information in the video frame of the image required to generate the image of the target viewpoint is determined from the arrangement information and the viewpoint identifier; and the image required to generate the image of the target viewpoint is cropped, at the position indicated by the position information, from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint. In this way the image corresponding to the target viewpoint is obtained for low-resolution display.
FIG. 3 shows the third embodiment of the present application, which includes the following steps:
Step S210: acquire images of the respective viewpoints captured by the respective cameras, different cameras capturing images corresponding to different viewpoints;
Step S220: stitch the first image corresponding to each main viewpoint with the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and encode the stitched video frames corresponding to the main viewpoint according to shooting time to generate a corresponding image frame sequence;
Step S230: after the decoding device receives a viewpoint generation display instruction sent by the display device and acquires the current viewpoint of the display device according to that instruction, transmit the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
In this embodiment, multiple cameras may be deployed, and the number of cameras can be set according to the actual situation. Images of the respective viewpoints captured by the respective cameras are acquired, with different cameras capturing images corresponding to different viewpoints. The images may be the viewpoint pictures corresponding to the respective viewpoints, or the viewpoint depth map pictures corresponding to the virtual viewpoints associated with the respective viewpoints. Each viewpoint is taken as a main viewpoint to generate a first image, and the viewpoints other than the main viewpoint are taken as secondary viewpoints corresponding to that main viewpoint to generate second images of the secondary viewpoints. For example, 10 cameras may be deployed to shoot video, arranged around a single shooting focus; P1-P10 are the images captured by the respective cameras and correspond to the viewpoint images numbered 1-10.
In this embodiment, the images captured by different cameras are stitched together, and the video frame corresponding to the main viewpoint is encoded and then sent to the display terminal for display. The image captured by each camera is the image corresponding to one viewpoint; one of the viewpoints is taken as the main viewpoint and the other viewpoints as secondary viewpoints. The video frames transmitted on each transmission path are obtained by encoding the image produced by stitching the main viewpoint corresponding to that transmission path with the secondary viewpoints other than the main viewpoint. In other words, each video frame is obtained mainly by stitching the images of the main viewpoint and the secondary viewpoints, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the images corresponding to the secondary viewpoints.
In this embodiment, the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are stitched to obtain the video frame corresponding to the main viewpoint, and the video frames corresponding to the main viewpoint are sent, in order of shooting time, to a general-purpose HEVC encoder and encoded to generate the corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image. For example, as shown in Figure 7, when P2 is the main viewpoint, the resolution of P2 is 2880*1620 and the resolution of each remaining secondary viewpoint is 960*540; that is, when P2 is the main viewpoint, the resolution of the P2 viewpoint is greater than that of the secondary viewpoints. During encoding, the encoding end takes each of the 10 images in turn as the main viewpoint and generates 10 transmission-path video frames. Each transmission-path video frame is obtained by stitching and encoding the image corresponding to one main viewpoint together with the images corresponding to the other 9 secondary viewpoints. For example, Figure 7 shows a video frame with the P2 viewpoint as the main viewpoint: when P2 is the main viewpoint, P1 and P3-P10 are the secondary viewpoints, and the images of all viewpoints are stitched to obtain the video frame corresponding to the P2 main viewpoint. When another viewpoint is taken as the main viewpoint, the images are stitched in the same way as for the video frame corresponding to the P2 main viewpoint, which is not repeated here.
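As a non-limiting illustration of the stitching described above, the following Python sketch computes one possible arrangement for each of the 10 main viewpoints. The resolutions 2880*1620 and 960*540 come from the text; the placement (main viewpoint on top, secondary viewpoints in a 3-column grid below) and the names `Tile`, `build_layout`, and `frame_size` are assumptions made for this example, since the actual arrangement of Figure 7 is not reproduced here.

```python
from dataclasses import dataclass

MAIN_W, MAIN_H = 2880, 1620  # main-viewpoint resolution given in the text
SUB_W, SUB_H = 960, 540      # secondary-viewpoint resolution given in the text


@dataclass(frozen=True)
class Tile:
    viewpoint_id: int
    x: int       # left edge of the tile in the stitched frame
    y: int       # top edge of the tile in the stitched frame
    width: int
    height: int


def build_layout(main_id, viewpoint_ids, columns=3):
    """Place the main viewpoint on top and the secondary viewpoints in a grid below.

    The placement is a hypothetical preset arrangement, not the one in Figure 7.
    """
    tiles = [Tile(main_id, 0, 0, MAIN_W, MAIN_H)]
    secondary = [v for v in viewpoint_ids if v != main_id]
    for index, vid in enumerate(secondary):
        row, col = divmod(index, columns)
        tiles.append(Tile(vid, col * SUB_W, MAIN_H + row * SUB_H, SUB_W, SUB_H))
    return tiles


def frame_size(tiles):
    """Return the (width, height) of the stitched frame that contains all tiles."""
    width = max(t.x + t.width for t in tiles)
    height = max(t.y + t.height for t in tiles)
    return width, height


# One layout per transmission path: each viewpoint 1-10 is the main viewpoint once.
layouts = {main: build_layout(main, range(1, 11)) for main in range(1, 11)}
```

Under these assumptions every stitched frame has the same size regardless of which viewpoint is the main one, so all 10 transmission paths could share one encoder configuration.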
In this embodiment, the image collected by each camera is the image corresponding to one viewpoint; one of the viewpoints is taken as the main viewpoint and the other viewpoints as secondary viewpoints. The video frames transmitted on each transmission path are obtained by encoding the image that results from stitching the main viewpoint of that transmission path with the secondary viewpoints other than the main viewpoint. That is, a video frame is obtained mainly by stitching the images of the main viewpoint and the secondary viewpoints, and the resolution of the image corresponding to the main viewpoint is greater than the resolution of the images corresponding to the secondary viewpoints, i.e. the resolution of the first image is greater than the resolution of the second image. After encoding is complete, when the decoding device receives a viewpoint generation and display instruction sent by the display device, and after the current viewpoint of the display device has been obtained according to that instruction, the image frame sequence corresponding to the current viewpoint is transmitted to the decoding device over the transmission path corresponding to the current viewpoint. For example, after the main viewpoint of the display device has been obtained according to the viewpoint generation and display instruction, the image frame sequence corresponding to that main viewpoint is transmitted to the decoding device over the transmission path corresponding to that main viewpoint.
According to the above technical solution, this embodiment encodes the images corresponding to different viewpoints by the following technical means: acquiring the images of each viewpoint captured by each camera, with different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is taken as a main viewpoint to generate a first image, the viewpoints other than the main viewpoint are taken as the secondary viewpoints corresponding to the main viewpoint to generate second images of the secondary viewpoints, and the image includes at least one of a viewpoint picture or a viewpoint depth map picture; stitching the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and encoding the stitched video frames corresponding to the main viewpoint according to shooting time to generate the corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image; and, when the decoding device receives a viewpoint generation and display instruction sent by the display device, and after the current viewpoint of the display device has been obtained according to that instruction, transmitting the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
As shown in Figure 4, which illustrates the fourth embodiment of the present application, the fourth embodiment is based on step S220 of the third embodiment and includes the following steps:
Step S221: stitching the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint;
Step S222: sorting the video frames corresponding to the main viewpoint by shooting time to generate a stitched image sequence;
Step S223: encoding the stitched image sequence to obtain the image frame sequence corresponding to each main viewpoint.
In this embodiment, in an on-demand viewing scenario, the video frames being watched are in fact historical video frames. The first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint are stitched to obtain the video frame corresponding to the main viewpoint, and the video frames corresponding to the main viewpoint may be sorted by shooting time to generate a stitched image sequence for each main viewpoint. After the stitched image sequence is obtained, it is encoded to obtain the image frame sequence corresponding to each main viewpoint, wherein the first frame of the image frame sequence corresponding to each main viewpoint is encoded as an I frame.
In this embodiment, for multi-path bitstreams, viewpoint switching can only take place at an I frame. The start frame is an I frame, also called a key frame. An I frame is an intra picture; it is usually the first frame of each image frame sequence, is moderately compressed, serves as a reference point for random access, and can be treated as a still image. During encoding, the bitstream may be divided into segments of 1 second each, with a start frame inserted every 1 second, so that each bitstream segment is 1 second long and begins with an I frame. The present application encodes the first frame of the image frame sequence corresponding to each main viewpoint as an I frame, so that when a viewpoint switching instruction is received, the video frame corresponding to the main viewpoint indicated by that instruction can be obtained from the start frame.
According to the above technical solution, this embodiment generates the image frame sequence corresponding to the main viewpoint by the following technical means: stitching the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint; sorting the video frames corresponding to the main viewpoint by shooting time to generate a stitched image sequence; and encoding the stitched image sequence to obtain the image frame sequence corresponding to each main viewpoint, wherein the first frame of the image frame sequence corresponding to each main viewpoint is encoded as an I frame.
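As a non-limiting illustration of the 1-second segmentation, the following Python sketch computes where the start frames (I frames) fall and the earliest frame at which a requested viewpoint switch could take effect. The frame rate of 25 frames per second used in the usage note, and the function names `segment_start_frames` and `next_switch_frame`, are assumptions made for this example and are not specified in the text.

```python
import math


def segment_start_frames(total_frames, fps, segment_seconds=1.0):
    """Return the indices of the frames that begin a segment, i.e. the I frames."""
    step = round(fps * segment_seconds)
    return list(range(0, total_frames, step))


def next_switch_frame(request_frame, fps, segment_seconds=1.0):
    """Return the first I-frame index at or after the frame where a switch was requested."""
    step = round(fps * segment_seconds)
    return math.ceil(request_frame / step) * step
```

With an assumed 25 frames per second, a switch requested at frame 30 would take effect at frame 50, the start of the next segment; until then the target viewpoint would have to be served from the current transmission path.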
As shown in Figure 5, which illustrates the fifth embodiment of the present application, the fifth embodiment is based on step S223 of the fourth embodiment and includes the following steps:
Step S2231: acquiring the arrangement information of the video frame corresponding to the main viewpoint, the arrangement information including at least the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the video frame corresponding to the main viewpoint;
Step S2232: encoding the stitched image sequence, and inserting the arrangement information into the sequence header of the encoded stitched image sequence to obtain the image frame sequence corresponding to the main viewpoint.
In this embodiment, the viewpoint identifier is a viewpoint number, i.e. the number corresponding to each viewpoint. The arrangement information of the video frame corresponding to the main viewpoint is generated from the way the viewpoints are laid out in the video frame according to a preset arrangement, and the preset arrangement may be set according to actual conditions. Specifically, the first image corresponding to each main viewpoint and the second images of all the other secondary viewpoints corresponding to that main viewpoint are stitched to obtain the video frame corresponding to the main viewpoint; the video frames corresponding to the main viewpoint are sorted by shooting time to generate a stitched image sequence, and the stitched image sequence is encoded. At the same time, the arrangement information of the video frame corresponding to the main viewpoint is acquired and inserted into the sequence header of the encoded stitched image sequence, thereby obtaining the image frame sequence corresponding to the main viewpoint.
In this embodiment, the viewpoints in the video frame corresponding to the main viewpoint are laid out according to the preset arrangement. Using this arrangement, the images corresponding to viewpoints P1-P10 in Figure 7 can be stitched to obtain the video frame corresponding to the main viewpoint, and the arrangement information of each viewpoint in that video frame is generated. The arrangement information of each viewpoint includes at least the viewpoint identifier of the viewpoint and the position information of the image of the viewpoint in the video frame corresponding to the main viewpoint. The arrangement information is written, as user extension information, into the picture header of the video frame corresponding to the main viewpoint.
According to the above technical solution, this embodiment generates the image frame sequence corresponding to the main viewpoint by the following technical means: acquiring the arrangement information of the video frame corresponding to the main viewpoint, the arrangement information including at least the viewpoint identifier of each viewpoint and the position information of the image of each viewpoint in the video frame corresponding to the main viewpoint; and encoding the stitched image sequence and inserting the arrangement information into the sequence header of the encoded stitched image sequence to obtain the image frame sequence corresponding to the main viewpoint.
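As a non-limiting illustration, arrangement information consisting of a viewpoint identifier, the coordinates of the top-left pixel, and the width and height of each viewpoint image could be carried as a small binary payload. The byte layout below (a one-byte entry count followed by five big-endian 16-bit fields per viewpoint) and the names `pack_arrangement` and `unpack_arrangement` are invented for this example; it is not the HEVC header syntax and not a format defined by the present application.

```python
import struct

# One entry: viewpoint id, x, y, width, height, each as a big-endian uint16.
ENTRY = struct.Struct(">HHHHH")


def pack_arrangement(entries):
    """Serialize (viewpoint_id, x, y, width, height) tuples into a byte payload."""
    payload = struct.pack(">B", len(entries))  # entry count, at most 255
    for entry in entries:
        payload += ENTRY.pack(*entry)
    return payload


def unpack_arrangement(payload):
    """Parse a payload produced by pack_arrangement back into a list of tuples."""
    (count,) = struct.unpack_from(">B", payload, 0)
    return [ENTRY.unpack_from(payload, 1 + i * ENTRY.size) for i in range(count)]
```

A decoder that parses such a payload once per sequence can then locate any viewpoint in the stitched frame by its identifier without knowing the preset arrangement in advance.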
Based on the same inventive concept, an embodiment of the present application further provides a decoding device. As shown in Figure 8, the decoding device includes a first receiving module 10, a first sending module 20, a second receiving module 30, and a second sending module 40;
wherein the first receiving module 10 is configured to, upon receiving a viewpoint generation and display instruction sent by a display device, obtain the current viewpoint of the display device according to the viewpoint generation and display instruction;
the first sending module 20 is configured to crop the image of the current viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, and to send the image of the current viewpoint to the display device to generate the current viewpoint picture;
the second receiving module 30 is configured to, upon receiving a viewpoint switching instruction sent by the display device, obtain the target viewpoint corresponding to the viewpoint switching instruction, crop the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, and send the image required to generate the image of the target viewpoint to the display device to generate the target viewpoint picture, wherein the image required to generate the image of the current viewpoint and the image required to generate the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint;
the second sending module 40 is configured to, when the switching condition is met, crop the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint, and send the image required to generate the image of the target viewpoint to the display device to generate the current viewpoint picture.
By using the above decoding device, the present application decodes the video stream to obtain the corresponding viewpoint pictures.
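As a non-limiting illustration of the decoding-side behavior, the following Python sketch crops a viewpoint image out of a decoded stitched frame using arrangement information, and decides which transmission path should feed the display during a viewpoint switch. A frame is modeled as a list of pixel rows, and the names `crop`, `find_region`, and `choose_path` are assumptions made for this example. Only the timestamp-equality switching condition is modeled.

```python
def find_region(arrangement, viewpoint_id):
    """Look up (x, y, width, height) for a viewpoint in the arrangement information."""
    for vid, x, y, width, height in arrangement:
        if vid == viewpoint_id:
            return x, y, width, height
    raise KeyError(viewpoint_id)


def crop(frame, x, y, width, height):
    """Cut a rectangle out of a frame stored as a list of pixel rows."""
    return [row[x:x + width] for row in frame[y:y + height]]


def choose_path(current_id, target_id, displayed_ts, target_path_ts):
    """Return the viewpoint whose transmission path should be used for display.

    The target path is used only once its frame timestamp equals the timestamp
    of the frame currently displayed; until then the low-resolution target
    image is cropped from the current path.
    """
    if target_path_ts is not None and target_path_ts == displayed_ts:
        return target_id
    return current_id
```

For example, while `choose_path` still returns the current viewpoint, the display would be fed `crop(frame, *find_region(arrangement, target_id))` taken from the current path's stitched frame.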
Based on the same inventive concept, an embodiment of the present application further provides an encoding device. As shown in Figure 9, the encoding device of the present application includes an image acquisition module 50, a stitching and encoding module 60, and a data transmission module 70;
wherein the image acquisition module 50 is configured to acquire the images of each viewpoint captured by each camera, with different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is taken as a main viewpoint to generate a first image, the viewpoints other than the main viewpoint are taken as the secondary viewpoints corresponding to the main viewpoint to generate second images of the secondary viewpoints, and the image includes at least one of a viewpoint picture or a viewpoint depth map picture;
the stitching and encoding module 60 is configured to stitch the first image corresponding to each main viewpoint and the second images of the secondary viewpoints corresponding to that main viewpoint to obtain the video frame corresponding to the main viewpoint, and to encode the stitched video frames corresponding to the main viewpoint according to shooting time to generate the corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
the data transmission module 70 is configured to, when the decoding device receives a viewpoint generation and display instruction sent by the display device, and after the current viewpoint of the display device has been obtained according to the viewpoint generation and display instruction, transmit the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
By using the above encoding device, the present application encodes the viewpoint pictures to obtain the corresponding video streams.
Based on the same inventive concept, an embodiment of the present application further provides a storage medium storing a video data processing program. When executed by a processor, the video data processing program implements each step of the video data processing method described above and achieves the same technical effects; to avoid repetition, details are not repeated here.
Because the storage medium provided by the embodiment of the present application is the storage medium used to implement the method of the embodiment of the present application, those skilled in the art can understand the specific structure and variations of the storage medium from the method described in the embodiment, so details are not repeated here. Any computer storage medium used by the method of the embodiments of the present application falls within the scope of protection sought by the present application.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that, in the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of components or steps not listed in a claim. The word "a" or "an" preceding a component does not exclude the presence of a plurality of such components. The present application may be implemented by means of hardware comprising several distinct components and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations of the present application fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.

Claims (13)

  1. A video data processing method, applied to a decoding device, the video data processing method comprising:
    upon receiving a viewpoint generation and display instruction sent by a display device, obtaining the current viewpoint of the display device according to the viewpoint generation and display instruction;
    cropping the image required to generate the image of the current viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, and sending the image required to generate the image of the current viewpoint to the display device to generate the current viewpoint picture;
    upon receiving a viewpoint switching instruction sent by the display device, obtaining the target viewpoint corresponding to the viewpoint switching instruction, cropping the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, and sending the image required to generate the image of the target viewpoint to the display device to generate the target viewpoint picture;
    when a switching condition is met, cropping the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint, and sending the image required to generate the image of the target viewpoint to the display device to generate the image of the target viewpoint.
  2. The method according to claim 1, wherein the step of cropping the image required to generate the image of the target viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint comprises:
    obtaining the viewpoint identifier corresponding to the target viewpoint and the arrangement information of the video frame;
    determining, according to the arrangement information and the viewpoint identifier, the position information in the video frame of the image required to generate the image of the target viewpoint;
    cropping, from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, the image required to generate the image of the target viewpoint corresponding to the position information.
  3. The method according to claim 1, wherein the image required to generate the image of the current viewpoint and the image required to generate the image of the target viewpoint each include at least one of a viewpoint picture or a viewpoint depth map picture.
  4. The method according to claim 1, wherein the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint.
  5. The method according to claim 1, wherein the switching condition comprises at least one of the following:
    the timestamp of the video frame corresponding to the currently displayed image is the same as the timestamp of the video frame in the transmission path corresponding to the target viewpoint;
    the timestamp of the video frame of the image frame sequence received over the transmission path corresponding to the current viewpoint reaches a preset time point.
  6. 一种视频数据处理方法,其中,应用于编码设备;所述视频数据处理方法包括:A video data processing method, wherein it is applied to a coding device; the video data processing method includes:
    获取各个摄像机拍摄的各个视点的图像,不同摄像机拍摄不同视点对应的图像,其中,将每个视点作为主视点生成第一图像,并将所述主视点之外的视点作为所述主视点对应的从视点生成所述从视点的第二图像,所述图像包括视点画面或者视点深度图画面中的至少一个;Obtain images of various viewpoints captured by each camera, and different cameras capture images corresponding to different viewpoints, wherein each viewpoint is used as a main viewpoint to generate a first image, and viewpoints other than the main viewpoint are used as corresponding to the main viewpoint generating said second image from a viewpoint from a viewpoint, said image comprising at least one of a viewpoint frame or a viewpoint depth map frame;
    将每个主视点对应的第一图像以及所述主视点对应的从视点的第二图像进行拼接得到主视点对应的视频帧,并根据拍摄时间对拼接后的所述主视点对应的视频帧进行编码以生成对应的图像帧序列,其中,所述第一图像的分辨率大于所述第二图像的分辨率;The first image corresponding to each main viewpoint and the second image corresponding to the main viewpoint from the viewpoint are spliced to obtain a video frame corresponding to the main viewpoint, and the spliced video frame corresponding to the main viewpoint is performed according to the shooting time. encoding to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
    在解码设备接收到显示设备发送的视点生成显示指令时,根据所述视点生成显示指令获取所述显示设备的当前视点之后,将所述当前视点对应的图像帧序列由所述当前视点对应的传输路径传输至解码设备。When the decoding device receives the viewpoint generation and display instruction sent by the display device, after acquiring the current viewpoint of the display device according to the viewpoint generation and display instruction, the image frame sequence corresponding to the current viewpoint is transmitted by the corresponding to the current viewpoint The path is transmitted to the decoding device.
  7. The method according to claim 6, wherein the step of splicing the first image corresponding to each main viewpoint with the second images of the secondary viewpoints corresponding to that main viewpoint to obtain video frames corresponding to the main viewpoint, and encoding the spliced video frames corresponding to the main viewpoint according to shooting time to generate a corresponding image frame sequence, comprises:
    splicing the first image corresponding to each main viewpoint with the second images of the secondary viewpoints corresponding to that main viewpoint to obtain video frames corresponding to the main viewpoint;
    sorting the video frames corresponding to the main viewpoint by shooting time to generate a spliced image sequence;
    encoding the spliced image sequence to obtain the image frame sequence corresponding to each main viewpoint.
  8. The method according to claim 7, wherein the first frame of the image frame sequence corresponding to each main viewpoint is encoded as an I-frame.
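The splicing, ordering, and I-frame steps of claims 7 and 8 can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: images are plain nested lists of pixel values, the secondary images are reduced by simple decimation, they are laid out in a strip below the main image, and the names `splice`, `build_sequence`, and `SplicedFrame` are hypothetical. The claims do not prescribe a layout or a downsampling method, and no real codec is invoked; the frame type is only recorded as a label.

```python
from dataclasses import dataclass

Image = list[list[int]]  # rows of pixel values (assumed representation)


def downsample(img: Image, factor: int) -> Image:
    """Keep every `factor`-th pixel in both directions (simple decimation)."""
    return [row[::factor] for row in img[::factor]]


def splice(main: Image, secondaries: list[Image], factor: int = 2) -> Image:
    """Place the full-resolution main image on top and the reduced
    secondary images side by side in a strip below it (assumed layout)."""
    width = len(main[0])
    frame = [row[:] for row in main]
    small = [downsample(s, factor) for s in secondaries]
    if not small:
        return frame
    for r in range(len(small[0])):
        line = [px for s in small for px in s[r]]
        if len(line) > width:
            raise ValueError("secondary strip is wider than the main image")
        frame.append(line + [0] * (width - len(line)))  # pad the strip
    return frame


@dataclass
class SplicedFrame:
    capture_time: float
    pixels: Image
    frame_type: str = "P"  # label only; a real encoder decides this


def build_sequence(captures, main_id: int, factor: int = 2) -> list[SplicedFrame]:
    """captures: iterable of (capture_time, {viewpoint_id: image})."""
    frames = []
    for t, views in sorted(captures, key=lambda c: c[0]):  # order by shooting time
        secondaries = [img for vid, img in sorted(views.items()) if vid != main_id]
        frames.append(SplicedFrame(t, splice(views[main_id], secondaries, factor)))
    if frames:
        frames[0].frame_type = "I"  # first frame of the sequence is an I-frame
    return frames
```

Calling `build_sequence` once per main viewpoint would yield one sequence per viewpoint, each carrying that viewpoint at full resolution and the others at reduced resolution.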
  9. The method according to claim 7, wherein the step of encoding the spliced image sequence to obtain the image frame sequence corresponding to each main viewpoint comprises:
    acquiring arrangement information of the video frames corresponding to the main viewpoint, the arrangement information comprising at least a viewpoint identifier of each viewpoint and position information of the image of each viewpoint within the video frame corresponding to the main viewpoint;
    encoding the spliced image sequence, and inserting the arrangement information into a sequence header of the encoded spliced image sequence to obtain the image frame sequence corresponding to the main viewpoint.
  10. The method according to claim 9, wherein the position information of the image of each viewpoint within the video frame corresponding to the main viewpoint comprises at least: the coordinates, within the video frame, of the top-left pixel of the viewpoint's image, and the width and height of the viewpoint's image.
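As a rough illustration of the arrangement information in claims 9 and 10, the sketch below packs a viewpoint identifier, the top-left coordinates, and the width and height of each region into a byte string that could be carried in a sequence header. The binary layout (a 16-bit count followed by five big-endian 16-bit fields per region) and the names `Region`, `pack_layout`, and `unpack_layout` are assumptions for this example only; the claims do not define a header syntax.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    viewpoint_id: int
    x: int       # column of the top-left pixel within the video frame
    y: int       # row of the top-left pixel within the video frame
    width: int
    height: int


# Hypothetical entry layout: five unsigned 16-bit big-endian integers.
_ENTRY = struct.Struct(">HHHHH")


def pack_layout(regions: list[Region]) -> bytes:
    """Serialize the arrangement information for insertion into a header."""
    out = struct.pack(">H", len(regions))
    for r in regions:
        out += _ENTRY.pack(r.viewpoint_id, r.x, r.y, r.width, r.height)
    return out


def unpack_layout(data: bytes) -> list[Region]:
    """Recover the arrangement information on the decoding side."""
    (count,) = struct.unpack_from(">H", data, 0)
    offset = 2
    regions = []
    for _ in range(count):
        regions.append(Region(*_ENTRY.unpack_from(data, offset)))
        offset += _ENTRY.size
    return regions
```

With this information a decoder can locate any viewpoint's image inside a spliced frame without knowing the layout in advance.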
  11. A decoding device, wherein the decoding device comprises:
    a first receiving module, configured to, upon receiving a viewpoint generation display instruction sent by a display device, acquire the current viewpoint of the display device according to the viewpoint generation display instruction;
    a first sending module, configured to crop the image of the current viewpoint from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, and to send the image of the current viewpoint to the display device to generate a current viewpoint picture;
    a second receiving module, configured to, upon receiving a viewpoint switching instruction sent by the display device, acquire the target viewpoint corresponding to the viewpoint switching instruction, crop, from the video frames of the image frame sequence received over the transmission path corresponding to the current viewpoint, the images required to generate the image of the target viewpoint, and send the images required to generate the image of the target viewpoint to the display device to generate a target viewpoint picture, wherein the images required to generate the image of the current viewpoint and the images required to generate the image of the target viewpoint each comprise at least one of a viewpoint picture or a viewpoint depth map picture, and the resolution of the picture corresponding to the current viewpoint is greater than the resolution of the picture corresponding to the target viewpoint;
    a second sending module, configured to, when a switching condition is met, crop, from the video frames of the image frame sequence received over the transmission path corresponding to the target viewpoint, the images required to generate the image of the target viewpoint, and send the images required to generate the image of the target viewpoint to the display device to generate the current viewpoint picture.
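The behaviour of the four modules in claim 11 can be sketched as a small state machine. This is an illustration under stated assumptions, not the claimed device: frames are nested lists, `frames` maps each main viewpoint to the latest decoded frame from its transmission path, `layouts` maps each main viewpoint to the regions in its spliced frame, and the switching condition (which this claim does not define) is reduced to a caller-supplied boolean `switch_ready`. The names `crop` and `ViewpointSwitcher` are hypothetical.

```python
from typing import Optional

Image = list[list[int]]
Rect = tuple[int, int, int, int]  # x, y, width, height


def crop(frame: Image, rect: Rect) -> Image:
    """Cut one viewpoint's region out of a spliced video frame."""
    x, y, w, h = rect
    return [row[x:x + w] for row in frame[y:y + h]]


class ViewpointSwitcher:
    def __init__(self, current: int) -> None:
        self.current = current               # viewpoint whose stream is in use
        self.target: Optional[int] = None    # pending switch target, if any

    def request_switch(self, target: int) -> None:
        """Record the target viewpoint of a viewpoint switching instruction."""
        self.target = target

    def next_image(self,
                   frames: dict[int, Image],
                   layouts: dict[int, dict[int, Rect]],
                   switch_ready: bool) -> Image:
        # Once the (assumed) switching condition holds, move to the target's
        # own stream, where the target is the full-resolution main viewpoint.
        if self.target is not None and switch_ready:
            self.current, self.target = self.target, None
        # Until then, serve the pending target from its reduced-resolution
        # region inside the current viewpoint's stream.
        wanted = self.current if self.target is None else self.target
        return crop(frames[self.current], layouts[self.current][wanted])
```

The point of the design is that a switch request can be answered immediately from the stream already being received, at reduced resolution, while the target viewpoint's own stream is being brought up.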
  12. An encoding device, wherein the encoding device comprises:
    an image acquisition module, configured to acquire images of the respective viewpoints captured by the respective cameras, different cameras capturing images corresponding to different viewpoints, wherein each viewpoint is taken as a main viewpoint to generate a first image, and the viewpoints other than the main viewpoint are taken as secondary viewpoints corresponding to the main viewpoint to generate second images of the secondary viewpoints, the images comprising at least one of a viewpoint picture or a viewpoint depth map picture;
    a splicing and encoding module, configured to splice the first image corresponding to each main viewpoint with the second images of the secondary viewpoints corresponding to that main viewpoint to obtain video frames corresponding to the main viewpoint, and to encode the spliced video frames corresponding to the main viewpoint according to shooting time to generate a corresponding image frame sequence, wherein the resolution of the first image is greater than the resolution of the second image;
    a data transmission module, configured to, when the decoding device receives a viewpoint generation display instruction sent by a display device and has acquired the current viewpoint of the display device according to the viewpoint generation display instruction, transmit the image frame sequence corresponding to the current viewpoint to the decoding device over the transmission path corresponding to the current viewpoint.
  13. A storage medium storing a video data processing program, wherein the video data processing program, when executed by a processor, implements the steps of the video data processing method according to any one of claims 1-10.
PCT/CN2021/129225 2021-09-02 2021-11-08 Video data processing method, decoding device, encoding device, and storage medium WO2023029207A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111040999.9 2021-09-02
CN202111040999.9A CN113900572A (en) 2021-09-02 2021-09-02 Video data processing method, decoding apparatus, encoding apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2023029207A1 (en) 2023-03-09

Family

ID=79188550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129225 WO2023029207A1 (en) 2021-09-02 2021-11-08 Video data processing method, decoding device, encoding device, and storage medium

Country Status (2)

Country Link
CN (1) CN113900572A (en)
WO (1) WO2023029207A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107615756A (en) * 2015-07-10 2018-01-19 华为技术有限公司 Realize the multi-view point video Streaming Media of quick and smooth viewpoint switch
CN110012310A (en) * 2019-03-28 2019-07-12 北京大学深圳研究生院 A kind of decoding method and device based on free view-point
US20200195997A1 (en) * 2017-09-12 2020-06-18 Panasonic Intellectual Property Corporation Of America Image display method, image distribution method, image display apparatus, and image distribution apparatus
CN111447503A (en) * 2020-04-26 2020-07-24 烽火通信科技股份有限公司 Viewpoint switching method, server and system for multi-viewpoint video
CN111447461A (en) * 2020-05-20 2020-07-24 上海科技大学 Synchronous switching method, device, equipment and medium for multi-view live video
CN111866525A (en) * 2020-09-23 2020-10-30 腾讯科技(深圳)有限公司 Multi-view video playing control method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190032670A (en) * 2017-09-18 2019-03-28 쿠도커뮤니케이션 주식회사 video service providing system using multi-view camera
CN110784740A (en) * 2019-11-25 2020-02-11 北京三体云时代科技有限公司 Video processing method, device, server and readable storage medium

Also Published As

Publication number Publication date
CN113900572A (en) 2022-01-07

Similar Documents

Publication Publication Date Title
US11924394B2 (en) Methods and apparatus for receiving and/or using reduced resolution images
US9161023B2 (en) Method and system for response time compensation for 3D video processing
CN112585978B (en) Generating a composite video stream for display in VR
US20120224025A1 (en) Transport stream structure including image data and apparatus and method for transmitting and receiving image data
JP2005522958A (en) Stereo video sequence coding system and method
US10958950B2 (en) Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
Ahmad Multi-view video: get ready for next-generation television
US10979689B2 (en) Adaptive stereo scaling format switch for 3D video encoding
US20150304640A1 (en) Managing 3D Edge Effects On Autostereoscopic Displays
US11528538B2 (en) Streaming volumetric and non-volumetric video
US10037335B1 (en) Detection of 3-D videos
JP2016163342A (en) Method for distributing or broadcasting three-dimensional shape information
EP3629584A1 (en) Apparatus and method for generating and rendering a video stream
JP2023529748A (en) Support for multi-view video motion with disocclusion atlas
EP2309766A2 (en) Method and system for rendering 3D graphics based on 3D display capabilities
CN111935436B (en) Seamless switching method and system of multiple video streams at playing end
US20150326873A1 (en) Image frames multiplexing method and system
WO2023029252A1 (en) Multi-viewpoint video data processing method, device, and storage medium
WO2023029207A1 (en) Video data processing method, decoding device, encoding device, and storage medium
CN114040184A (en) Image display method, system, storage medium and computer program product
WO2011094164A1 (en) Image enhancement system using area information
KR102658474B1 (en) Method and apparatus for encoding/decoding image for virtual view synthesis
JP2002162953A (en) Image processor, image processing method, contents distribution system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21955736

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE