WO2022100742A1 - Video encoding and video playback method, apparatus and system - Google Patents

Video encoding and video playback method, apparatus and system

Info

Publication number
WO2022100742A1
Authority
WO
WIPO (PCT)
Prior art keywords
video stream
video
target video
target
played
Prior art date
Application number
PCT/CN2021/130745
Other languages
French (fr)
Chinese (zh)
Inventor
Zheng Luo (郑洛)
Original Assignee
Huawei Cloud Computing Technologies Co., Ltd. (华为云计算技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co., Ltd.
Publication of WO2022100742A1 publication Critical patent/WO2022100742A1/en



Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • the present application relates to the technical field of video processing, and in particular, to a video coding and video playback method, device and system.
  • Surround playback refers to playback of images collected at different angles for the same spatial area in a certain surround direction.
  • displaying the picture of surround playback to the user means showing the user, according to the surround direction, the images collected by the camera at each camera position.
  • the video streams collected by cameras at different positions are independently compressed, so images collected by cameras at other positions cannot be referred to during decompression.
  • Since the consecutive multi-frame images played by the terminal come from cameras at different positions, decompressing a certain frame image requires relying on other images in that frame's corresponding original video stream, which leads to a higher transmission bit rate of the video stream.
  • Embodiments of the present application provide a video encoding and video playback method, device, and system, which help to reduce the transmission bit rate of a video stream, and can be applied to a surround playback scenario.
  • the present application provides a video encoding method.
  • the method includes: first, acquiring multiple original video streams obtained based on video streams collected by multiple cameras for the same spatial area in the same time period. Second, at least one target video stream is generated according to the plurality of original video streams.
  • the target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction.
  • the at least one target video stream is compressed.
  • In the encoding stage, a certain number of frame images are selected from the original video stream corresponding to each camera according to the set direction to generate the target video stream, and then the target video stream is compressed.
  • In this way, the video clips selected from the compressed target video stream can be directly transmitted to the terminal, so that the terminal can decompress and decode the video clips to be played without relying on other images of each original video stream, which helps to reduce the transmission bit rate of the video stream.
  • the time stamps corresponding to the images in the target video stream are consecutive. In this way, it is helpful for the terminal to realize real-time surround playback within the consecutive multiple time stamps, thereby increasing the smoothness of the video picture.
  • the multiple original video streams include a first original video stream, and the target video stream takes a first frame image in the first original video stream as a starting point. In this way, it is helpful for the terminal to realize surround playback from the first frame image in the original video stream.
  • the method further includes: generating and sending an index of the at least one target video stream.
  • the index of the target video stream includes: the identifier of the camera corresponding to the target video stream and the category of the target video stream.
  • the camera corresponding to the target video stream is the camera corresponding to the first frame image in the target video stream.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream. In this way, when the number of cameras is large, it helps to save the storage space occupied by the index of the video stream.
  • the set direction is a surrounding direction of the plurality of cameras, and the surrounding direction includes a clockwise direction or a counterclockwise direction.
  • the method further includes: encapsulating the compressed at least one target video stream to obtain multiple segments. Then, the indices of the plurality of segments are generated and transmitted. In this way, during video playback, the terminal can acquire the video clips to be played based on the segment granularity, instead of having to acquire them based on the slice granularity or the video stream granularity, which helps to save transmission resources.
  • the present application provides a video playback method, the method comprising: receiving a video clip to be played, where the video clip to be played is selected from a first target video stream, and the first target video stream is a video stream obtained by selecting, according to a set direction, a certain number of frame images from the original video stream corresponding to each camera in a plurality of cameras.
  • the multiple original video streams corresponding to the multiple cameras are obtained based on the video streams collected by the multiple cameras for the same spatial area in the same time period.
  • the video clip to be played is decompressed, and the decompressed video clip is played. Since the video clip to be played by the terminal is selected from the first target video stream, the terminal does not need to acquire images compressed based on different original video streams. In this way, it helps to reduce the transmission bit rate of the video stream.
  • the time stamps corresponding to the images in the first target video stream are consecutive. In this way, it is helpful for the terminal to realize real-time surround playback in multiple consecutive time stamps.
  • the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point. In this way, it is helpful for the terminal to realize surround playback from the first frame image in the original video stream.
  • before receiving the to-be-played video clip, the method further includes: sending a request message, where the request message is used to request the to-be-played video clip.
  • the request message includes an index of the first target video stream.
  • the index of the first target video stream is used to determine the first target video stream, and the index of the first target video stream includes the identifier of the camera corresponding to the first target video stream and the category of the first target video stream.
  • the camera corresponding to the first target video stream is the camera corresponding to the first frame image in the first target video stream.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream.
  • the method further includes: receiving an index of at least one target video stream.
  • the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
  • the index of the at least one target video stream may be actively requested by the terminal from the network device, or may be actively pushed by the network device to the terminal.
  • the request message further includes an index of the target segment to which the video segment to be played belongs.
  • receiving the video stream segment to be played includes: receiving the target segment. In this way, it helps to further reduce the transmission bit rate of the video stream.
  • the method further includes: receiving an index of a segment in the at least one target video stream.
  • the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
  • the index of the segment in the at least one target video stream may be actively requested by the terminal from the network device, or may be actively pushed by the network device to the terminal.
  • the method further includes: determining the wraparound direction of the video clip to be played and a timestamp corresponding to the starting image of the video clip to be played; then, determining the index of the first target video stream based on the wraparound direction of the video clip to be played and the timestamp corresponding to the starting image of the video clip to be played.
  • the wraparound direction corresponding to the first target video stream is the same as the wraparound direction of the video clip to be played; and the first target video stream includes a first image whose previous frame image is an image in the currently playing video stream, where the timestamp corresponding to the first image is the same as the timestamp corresponding to the starting image.
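  • As a rough illustration of this selection rule, the following Python sketch picks the target video stream whose image at the currently played timestamp comes from the currently played camera, so that playback continues from the next timestamp. It is an assumption-laden example (rotation interval of one frame, sequentially numbered cameras, made-up function and field names), not the claimed implementation.

```python
def select_target_stream(current_camera, current_timestamp, clockwise, num_cameras):
    """Pick the target video stream whose image at `current_timestamp` comes from
    `current_camera`, so that surround playback continues from the next timestamp.
    Cameras are numbered 1..num_cameras; a rotation interval of 1 frame is assumed."""
    step = 1 if clockwise else -1
    # For a target stream starting at camera k, the image at timestamp t comes from
    # camera ((k - 1) + step * (t - 1)) % num_cameras + 1; solve that relation for k.
    k = ((current_camera - 1) - step * (current_timestamp - 1)) % num_cameras + 1
    return {"camera_id": k,
            "category": "first_type" if clockwise else "second_type",
            "start_timestamp": current_timestamp + 1}

# Example: the terminal plays original video stream 1 and, at the 4th frame image,
# starts clockwise surround playback (the scenario discussed later for FIG. 4).
print(select_target_stream(current_camera=1, current_timestamp=4,
                           clockwise=True, num_cameras=5))
# -> {'camera_id': 3, 'category': 'first_type', 'start_timestamp': 5}
```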
  • the present application provides a video playback method, the method comprising: determining a video clip to be played, where the to-be-played video clip is selected from a first target video stream, and the first target video stream is a video stream obtained by selecting, according to a set direction, a certain number of frame images from the original video stream corresponding to each camera in the multiple cameras.
  • the multiple original video streams corresponding to the multiple cameras are obtained based on the video streams collected by the multiple cameras for the same spatial area in the same time period. Then, the to-be-played video stream segment is sent to the terminal.
  • the time stamps corresponding to the images in the first target video stream are consecutive.
  • the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point.
  • before determining the to-be-played video clip, the method further includes: receiving a request message sent by the terminal, where the request message is used to request the to-be-played video clip.
  • the request message includes an index of the first target video stream.
  • the index of the first target video stream includes an identifier of a camera corresponding to the first target video stream and a category of the first target video stream.
  • the camera corresponding to the first target video stream is the camera corresponding to the first frame image in the first target video stream.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream.
  • the method further includes: determining the first target video stream based on the index of the first target video stream.
  • the method further includes: sending an index of at least one target video stream to the terminal; wherein the at least one target video stream is generated according to the multiple original video streams, and the at least one target video stream includes The first target video stream.
  • the request message further includes an index of the target segment to which the video segment to be played belongs.
  • the method also includes determining the target segment from the first target video stream based on the index of the target segment.
  • sending the video stream segment to be played to the terminal includes: sending the target segment to the terminal.
  • the method further includes: sending an index of a segment in the at least one target video stream to the terminal.
  • the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
  • the network device that executes the method provided by the first aspect may be the same as or different from the network device that executes the method provided by the third aspect.
  • the present application provides a video encoding apparatus.
  • the apparatus may be a chip or a network device.
  • the apparatus is used to perform any one of the methods provided in the first aspect above.
  • the present application may divide the device into functional modules according to the method provided in the first aspect.
  • each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the present application may divide the apparatus into an acquisition unit, a generation unit, a compression unit, and the like according to functions.
  • the apparatus includes: a processor for implementing any one of the methods described in the first aspect above.
  • the apparatus may further include a memory, the memory is coupled to the processor, and the memory is used for storing a computer program.
  • the processor executes the computer program stored in the memory, any one of the methods described in the first aspect can be implemented.
  • the apparatus may also include a communication interface for the apparatus to communicate with other devices; for example, the communication interface may be a transceiver, circuit, bus, module or other type of communication interface.
  • the computer program in the memory in this application can be pre-stored or stored after being downloaded from the Internet when the device is used. This application does not uniquely limit the source of the computer program in the memory.
  • the coupling in this embodiment of the present application is an indirect coupling or connection between units or modules, which may be in electrical, mechanical or other forms, and is used for information interaction between units or modules.
  • the present application provides a video playback device.
  • the apparatus is used to perform any one of the methods provided in the second aspect or the third aspect.
  • the present application may divide the device into functional modules according to the method provided in the second aspect or the third aspect.
  • each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • when the apparatus performs any one of the methods provided in the second aspect, the apparatus may be a terminal, and the present application may divide the apparatus into a receiving unit, a decompression unit, and a playing unit according to functions.
  • when the apparatus performs any one of the methods provided in the third aspect, the apparatus may be a chip or a network device, and the present application may divide the apparatus into a determining unit and a sending unit according to functions.
  • the apparatus includes: a processor, configured to implement any one of the methods described in the second aspect or the third aspect.
  • the apparatus may further include a memory coupled to the processor, where the memory is used to store a computer program, and when the processor executes the computer program stored in the memory, any one of the methods described in the second or third aspects above can be implemented.
  • the apparatus may also include a communication interface for the apparatus to communicate with other devices; for example, the communication interface may be a transceiver, circuit, bus, module or other type of communication interface.
  • the computer program in the memory in this application can be pre-stored or stored after being downloaded from the Internet when the device is used. This application does not uniquely limit the source of the computer program in the memory.
  • the coupling in this embodiment of the present application is an indirect coupling or connection between units or modules, which may be in electrical, mechanical or other forms, and is used for information interaction between units or modules.
  • the present application provides a computer readable storage medium, such as a computer non-transitory readable storage medium.
  • a computer program (or instruction) is stored thereon, and when the computer program (or instruction) runs on the video encoding apparatus, the video encoding apparatus is made to execute any one of the methods provided in the first aspect.
  • the present application provides a computer readable storage medium, such as a computer non-transitory readable storage medium.
  • a computer program (or instruction) is stored thereon, and when the computer program (or instruction) runs on the video playback device, the video playback device is made to execute any one of the methods provided in the second aspect or the third aspect.
  • the present application provides a computer program product that, when executed on a computer, enables any one of the methods provided in the first to third aspects to be executed.
  • the present application provides a chip system, comprising: a processor, where the processor is configured to call, from a memory, and run a computer program stored in the memory, to execute any one of the methods provided in the first to third aspects.
  • the present application provides a video system, including: a network device and a terminal.
  • the network device may be configured to execute any of the methods provided in the foregoing first aspect
  • the terminal may be configured to execute any of the foregoing methods provided in the second aspect.
  • the network device may also be used to execute any one of the methods provided in the third aspect.
  • the video system further includes other network devices for executing any of the methods provided in the third aspect.
  • any of the above-provided video encoding devices, video playback devices, computer storage media, computer program products or video systems can be applied to the corresponding methods provided above.
  • beneficial effects reference may be made to the beneficial effects in the corresponding method, which will not be repeated here.
  • the names of the above-mentioned video encoding apparatus and video playback apparatus do not limit the devices or functional modules themselves. In actual implementation, these devices or functional modules may appear in other names. As long as the functions of each device or functional module are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalents.
  • FIG. 1 is a schematic diagram of a distribution manner of cameras according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of the relationship between a video stream, slice, segment, I frame and P frame provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of video streams collected by multiple cameras in a surround playback scenario provided by the conventional technology;
  • FIG. 4 is a schematic diagram of a transmitted video stream provided by the conventional technology based on FIG. 3;
  • FIG. 5A is a schematic structural diagram of a video system provided by an embodiment of the present application;
  • FIG. 5B is a schematic structural diagram of another video system provided by an embodiment of the present application;
  • FIG. 5C is a schematic structural diagram of another video system provided by an embodiment of the present application;
  • FIG. 5D is a schematic structural diagram of another video system provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of a dual-focus application scenario provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 9 is a schematic process diagram of a video encoding method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an encapsulation format provided by an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of a video playback method provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a video stream transmitted between a distribution side system and a terminal and a played video stream provided by an embodiment of the application;
  • FIG. 13 is a schematic diagram of another video stream transmitted between a distribution side system and a terminal and a played video stream provided by an embodiment of the present application;
  • FIG. 14 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a video playback device according to an embodiment of the application.
  • FIG. 16 is a schematic structural diagram of another video playback apparatus provided by an embodiment of the present application.
  • Surround playback refers to playback of images collected at different angles for the same spatial area in a certain surround direction.
  • Surround playback includes still surround playback and dynamic surround playback.
  • Still surround playback refers to playback of images collected from the same spatial area at different angles at the same moment in a certain surround direction.
  • Dynamic surround playback refers to the playback of images collected from the same spatial area at different times and at different angles in a certain surround direction.
  • the surround playback involved in this application refers to dynamic surround playback.
  • the surround direction refers to the clockwise or counterclockwise direction based on the angle at which the currently playing image is captured.
  • the space area refers to the area targeted for surround playback, usually an area with a specific focus, such as the space area where the basketball arena is located, or the space area where the stage of the concert is located.
  • surround playback requires video streams collected by multiple cameras distributed in a specific manner for the same spatial area at different angles, and one video stream includes consecutive multiple frames of images.
  • the field of view of each camera in the plurality of cameras has an overlapping area with the spatial area.
  • the fields of view of different cameras may or may not have overlapping areas.
  • This embodiment of the present application does not limit the distribution manner of the plurality of cameras.
  • the plurality of cameras may be distributed in an annular manner, as shown in (a) of FIG. 1.
  • the plurality of cameras may be distributed in a fan-shaped manner, as shown in (b) of FIG. 1.
  • the plurality of cameras may be distributed in a right-angle (ie, 90°) manner, as shown in (c) of FIG. 1.
  • the plurality of cameras may be distributed in a flat-angle (ie, 180°) manner (or a straight-line manner), as shown in (d) of FIG. 1.
  • the plurality of cameras may be distributed evenly or unevenly.
  • the multiple cameras are uniformly distributed as an example for description.
  • the surrounding direction is the clockwise or counterclockwise direction based on the distribution of the plurality of cameras, taking as the reference the camera that captures the "currently playing image" (or the camera that captures the image used to obtain the "currently playing image").
  • Surround playback is to sequentially play the images collected by the multiple cameras (or images obtained by processing based on the images collected by the multiple cameras) according to the surrounding direction.
  • surround playback requires camera synchronization technology to ensure that the multiple cameras capture video streams at the same time.
  • With camera synchronization technology, it is ensured that the multiple cameras all collect an image at the same moment, and that the time interval between collecting two adjacent frames of images is the same thereafter.
  • Each frame of image in a video stream corresponds to a timestamp, which is used for aligning the image with images in other video streams, for image transmission, and the like. For example, assuming that 5 cameras are deployed in the acquisition side system of the surround playback scenario, and the video stream collected by each camera includes 100 frames of images, then according to the acquisition order, the images in each video stream can correspond to timestamps 1-100.
  • the images in the video stream may include I frames and P frames, and the I frame and the P frame indicate the compression mode of the corresponding frame image. Specifically:
  • I-frames represent keyframes.
  • the I frame belongs to intra-frame compression encoding, that is, compression is performed based on the image of the current frame, and during decoding, the image of the current frame can be obtained by decompression based on the compressed data.
  • the P frame represents the difference between the image of the current frame and the image of the previous frame (which may specifically be an I frame or a P frame).
  • the P frame belongs to inter-frame compression encoding, that is, compression is performed based on the difference between the image of this frame and the image of the previous frame.
  • During decoding, the image of this frame is obtained by decompressing the compressed data together with the image of the previous frame.
  • a video stream may include multiple slices, and each slice includes one frame of image or consecutive multiple frames of images. Each fragment can be encoded and decoded independently and transmitted independently.
  • the first frame of image in a slice is an I frame, and other images may be I frames or P frames.
  • the first picture in a slice is an I-frame, and the other pictures are P-frames.
  • the image is encapsulated to obtain segments.
  • a segment may include one frame of image or consecutive multiple frames of images, and each frame of image may be an I frame or a P frame.
  • a slice can contain multiple segments. Segments cannot be encoded and decoded independently, but they can be transmitted independently.
  • the encapsulation manner may include chunk encapsulation or transport stream (transport stream, TS) encapsulation, and the like.
  • FIG. 2 is a schematic diagram of the relationship among a video stream, a slice, a segment, an I frame and a P frame according to an embodiment of the present application.
  • FIG. 2 shows an example in which the video stream includes multiple slices, such as slice 1 and slice 2; one slice includes 10 frames of images, where the first frame image in each slice is an I frame and the other images are P frames; and one segment includes one frame of image.
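  • The following Python sketch (illustrative only; the class and variable names are assumptions) models the relationship shown in FIG. 2: a 100-frame video stream divided into slices of 10 frames, where the first frame of each slice is an I frame and the others are P frames, and each segment carries one frame of image.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: int
    frame_type: str  # "I" or "P"

# FIG. 2 style layout: 100 frames, slices of 10 frames where the first frame of
# each slice is an I frame and the rest are P frames; each segment holds one frame.
frames = [Frame(t, "I" if (t - 1) % 10 == 0 else "P") for t in range(1, 101)]
slices = [frames[i:i + 10] for i in range(0, 100, 10)]  # independently decodable
segments = [[frame] for frame in frames]                # independently transmittable

print(len(slices), len(segments))                        # 10 100
print([frame.frame_type for frame in slices[0]])         # ['I', 'P', 'P', ..., 'P']
```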
  • words such as "exemplary" or "for example" are used to represent examples, illustrations or descriptions. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present the related concepts in a specific manner.
  • first and second are only used for description purposes, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature defined as “first” or “second” may expressly or implicitly include one or more of that feature.
  • plural means two or more.
  • each video stream contains multiple groups of pictures (GOPs), wherein the first frame image of each GOP is an I frame, and the other images may be P frames. The more I frames there are, the higher the transmission bit rate.
  • video streams 1-5 are acquired by cameras 1-5 deployed clockwise, and each video stream contains 100 frames of images.
  • the encoder compresses the five video streams respectively.
  • the terminal starts playing from the first frame of video stream 1 (that is, normal playback), and when playing the fourth frame of video stream 1, the terminal determines that it needs to start clockwise playback under the user's instruction.
  • the video clips played by the terminal sequentially include: the 5th frame image in video stream 2, the 6th frame image in video stream 3, the 7th frame image in video stream 4, the 8th frame image in video stream 5, the 9th frame image in video stream 1, the 10th frame image in video stream 2, and so on, as shown in FIG. 4.
  • the fifth frame image and the fourth frame image in the video clip played by the terminal are selected from different video streams.
  • In the conventional technology, the encoder compresses the video stream collected by each camera separately, as shown in FIG. 3. Therefore, during surround playback, when the terminal needs to correctly decode the 5th frame image, it cannot refer to the 4th frame image in the played video stream; therefore, in the encoding stage, the 5th frame image needs to be set as an I frame. Similarly, the subsequent images in the surround playback stage have the same problem, which causes the problem of a high transmission bit rate.
  • In the case where the first frame image of each GOP is an I frame and the other images are P frames,
  • if the first frame image of the surround playback is a non-first frame image of a GOP,
  • the terminal needs to decode in sequence from the first frame image of that GOP to obtain the first frame image of the surround playback, which makes real-time surround playback impossible. Therefore, in order to realize real-time surround playback, the GOP is generally set very small in the conventional technology. For example, a GOP includes one or two frames of images, which causes a video stream to contain many I frames, resulting in the problem of a high video transmission bit rate.
  • For example, if a GOP includes 5 frames of images and the image to be played first is the 5th frame image of the GOP,
  • the terminal needs to decode the 1st frame image in the GOP to obtain the 2nd frame image,
  • then decode the 2nd frame image to obtain the 3rd frame image, and so on, until the 5th frame image is obtained by decoding the 4th frame image. This process takes a long time, so real-time surround playback cannot be realized.
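  • The cost described above can be made concrete with a small sketch. Assuming only the first frame of each GOP is an I frame, the number of frames that must be decoded before the desired frame can be shown grows with its position in the GOP; the function name below is made up for the example.

```python
def frames_to_decode(target_position, gop_size):
    """Frames that must be decoded to show the frame at `target_position`
    (1-based) when only the first frame of each GOP is an I frame."""
    return (target_position - 1) % gop_size + 1

# GOP of 5 frames: to show the 5th frame image, frames 1-5 must all be decoded.
print(frames_to_decode(5, gop_size=5))  # 5
# GOP of 1 frame (every frame an I frame): immediate access, but many more
# I frames to transmit, hence the higher transmission bit rate noted above.
print(frames_to_decode(5, gop_size=1))  # 1
```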
  • the embodiments of the present application provide a video encoding method and a video playback method, which can be applied to a surround playback scenario.
  • In the encoding stage, the network device selects, according to a set direction, a certain number of frame images from the original video stream corresponding to each camera in the surround playback scenario to generate the target video stream, and then compresses the target video stream. Based on this, in the surround playback stage, the network device transmits the video clips selected from the compressed target video stream to the terminal, so that the terminal can decode the video clips to be played, without needing to transmit to the terminal images compressed based on different original video streams. Therefore, this helps to reduce the transmission bit rate of the video stream.
  • Compressing the target video stream is equivalent to allowing an image in one original video stream to be compressed with reference to an image in another original video stream.
  • the terminal can decode an image in one original video stream to obtain an image in another original video stream.
  • the video stream transmitted during video playback does not need to include many I-frames, thereby helping to reduce the transmission bit rate of the video stream.
  • FIG. 5A is a schematic structural diagram of a video system 1 according to an embodiment of the present application.
  • the video system 1 includes: an acquisition side system 10, an encoder 20, and one or more terminals (eg, a terminal 30 and a terminal 31).
  • the acquisition side system 10 includes a plurality of cameras.
  • the multiple cameras are distributed and deployed in a certain spatial area based on a specific manner (as shown in FIG. 1 ), so as to collect video streams for the spatial area from different angles.
  • Each camera can be a fixed focus camera.
  • the focal points of the multiple cameras may correspond to the same focus area, or may correspond to multiple focus areas. That is to say, the surround playback of the embodiments of the present application can be applied to a single-focus scene, and can also be applied to a multi-focus scene (eg, a bi-focus scene).
  • FIG. 6 is a schematic diagram of a dual-focus application scenario.
  • FIG. 6 shows a scene set for a basketball game. Multiple cameras are deployed around the basketball court and distributed in a ring shape. Each camera has a focal point; the focus of some cameras is in the first focus area, and the focus of the other cameras is in the second focus area, thus forming a dual-focus scene.
  • Specifically, the focus of one part of the cameras is positioned to the first focus area of the basketball arena by manual focusing, and the focus of another part of the cameras is positioned to the second focus area of the basketball arena; in addition, camera synchronization technology is used so that the multiple cameras each capture an image at the same moment, and the time interval for subsequently capturing two adjacent frames of images is the same.
  • the encoder 20 is configured to execute the video encoding method provided by the embodiment of the present application.
  • the specific implementation of the video encoding method provided by the embodiments of the present application may refer to the following, for example, refer to the video encoding method shown in FIG. 8 .
  • the terminal, which may also be referred to as a playback terminal, is used to decode and play video clips in the video stream.
  • This embodiment of the present application does not limit the physical form of the terminal, for example, it may be a smart phone, a tablet computer, or the like.
  • the encoder 20 is a functional module, and its functions can be implemented by software, or by hardware, or by software combined with hardware.
  • the acquisition-side system 10 and the encoder 20 are independently installed.
  • the acquisition-side system 10 may further include a control node connected to the above-mentioned multiple cameras, and the control node may control the multiple cameras.
  • the encoder 20 may be integrated in the control node.
  • the encoder 20 can also be independent of the acquisition-side system 10.
  • the multiple cameras can send the captured video streams to the control node, and the control node sends the video streams (or the processed video streams) to the encoder 20.
  • the encoder 20 is further configured to communicate with the terminal, so as to provide the terminal with video segments in the encoded video stream, so as to realize video playback.
  • the video system 1 may further include: a distribution side system 40 .
  • the encoder 20 may be integrated in the distribution-side system 40 , or may be provided independently from the distribution-side system 40 .
  • the encoder 20 and the distribution-side system 40 are independently installed as an example for description.
  • the encoder 20 can also be used to send the encoding result to the distribution-side system 40 .
  • the distribution-side system 40 is configured to communicate with the terminal, so as to distribute the video clips in the encoded video stream to the terminal, so as to realize video playback.
  • the distribution-side system 40 may be a content delivery network (content delivery network, CDN).
  • CDN content delivery network
  • the CDN may be any kind of CDN in the conventional technology, but is of course not limited to this.
  • the function of CDN can be implemented by one server or by multiple servers.
  • the distribution-side system 40 may be one or more dedicated servers, that is, servers or server clusters specially set up to implement the video playback method provided by the embodiments of the present application. In general, using a CDN for video distribution can save costs compared to using a dedicated server.
  • the video system may also include an origin station 50 .
  • the source site 50 is the data source of the CDN.
  • the encoder 20 may be integrated in the source station 50 .
  • the encoder 20 may also be independent of the source station 50.
  • the encoder 20 may send the encoding result of the video encoding method provided by the embodiment of the present application to the CDN via the source station 50.
  • the source station 50 may be implemented by one server, or may be implemented by multiple servers together.
  • It should be noted that FIGS. 5A to 5D are merely examples of video systems applicable to the embodiments of the present application, and do not constitute a limitation on the video systems to which the video encoding and video playback methods provided by the embodiments of the present application are applicable.
  • the above-mentioned device for realizing the function of the encoder 20, the server for realizing the function of the distribution side system 40, and the terminal can all be realized by the computer device 70 as shown in FIG. 7 .
  • a computer device 70 may be used to implement the video encoding method or the video playing method provided by the embodiments of the present application.
  • If the computer device 70 is a device that implements the functions of the encoder 20, it is used to implement the video encoding method provided by the embodiments of the present application, and optionally, it is also used to implement the video playback method provided by the embodiments of the present application.
  • If the computer device 70 is a server that implements the functions of the distribution-side system 40, or a terminal, it is used to implement the video playback method provided by the embodiments of the present application.
  • the computer device 70 shown in FIG. 7 may include a processor 701, a memory 702, a communication interface 703, and a bus 704.
  • the processor 701 , the memory 702 and the communication interface 703 may be connected through a bus 704 .
  • the processor 701 is the control center of the computer device 70, and may specifically be a general-purpose central processing unit (central processing unit, CPU), or other general-purpose processors, or the like. Wherein, the general-purpose processor may be a microprocessor or any conventional processor or the like.
  • processor 701 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 7 .
  • The memory 702 may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 702 may be independent of the processor 701 .
  • the memory 702 may be connected to the processor 701 through a bus 704 for storing data, instructions or program codes.
  • the processor 701 calls and executes the instructions or program codes stored in the memory 702, the video encoding method or the video playing method provided by the embodiments of the present application can be implemented.
  • the memory 702 can also be integrated with the processor 701 .
  • the communication interface 703 is used for connecting the computer device 70 with other devices through a communication network, and the communication network can be an Ethernet, a radio access network (RAN), a wireless local area network (wireless local area networks, WLAN) and the like.
  • the communication interface 703 may include a receiving unit for receiving data, and a transmitting unit for transmitting data.
  • the bus 704 may be an industry standard architecture (industry standard architecture, ISA) bus, a peripheral component interconnect (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one thick line is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
  • the structure shown in FIG. 7 does not constitute a limitation on the computer device 70.
  • the computer device 70 may include more or fewer components than those shown in the figure, may combine some components, or may have a different arrangement of components.
  • the computer device 70 may further include a display screen, an audio input and output device, and the like, which is not limited in this embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a video encoding method provided by an embodiment of the present application. This embodiment is described by taking the method applied to the video system shown in FIG. 5C as an example.
  • The multiple cameras in this embodiment may be some or all of the cameras in the acquisition-side system 10 in FIG. 5C.
  • The encoder and the distribution-side system in this embodiment may be the encoder 20 and the distribution-side system 40 in FIG. 5C, respectively.
  • the method shown in FIG. 8 may include the following S101-S108:
  • S101 Multiple cameras collect multiple video streams for the same spatial area in the same time period. That is, each camera captures one video stream for the spatial area during the time period.
  • the explanation of the spatial area and the like may refer to the above.
  • the camera can continuously collect images, and the same time period in S101 can be any time period in the process of the camera collecting images.
  • the camera can take the video stream captured at each cycle as a raw video stream.
  • the duration of the cycle is equal to the duration of the time period in S101.
  • the multiple cameras send multiple original video streams to the encoder.
  • one camera corresponds to one raw video stream.
  • the original video stream corresponding to one camera may be the video stream collected by the camera in S101, or the video stream obtained by the camera after processing the video stream collected in S101.
  • the video stream received by the encoder is defined as the original video stream.
  • the encoder generates at least one target video stream according to the multiple original video streams.
  • the target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction.
  • the set direction is a surrounding direction of the plurality of cameras, such as a clockwise surrounding direction or a counterclockwise surrounding direction.
  • the number of frame images selected from different original video streams may be the same or different.
  • the time stamps corresponding to the images in the target video stream are consecutive.
  • the following specific examples are all described by taking the continuous time stamps corresponding to the images in the target video stream as an example for description. Here, a unified description is provided, and details are not repeated below.
  • the multiple original video streams include a first original video stream
  • the first original video stream may be any one of the multiple original video streams.
  • the target video stream starts with the first frame image in the first original video stream. In this way, it is helpful for the terminal to realize surround playback from the first frame image in the original video stream.
  • each original video stream and each target video stream include the same number of frames of images.
  • the number of target video streams generated by the encoder based on different original video streams is the same.
  • S103 may include: for each original video stream in the multiple original video streams, the encoder generates a first-type video stream and a second-type video stream.
  • Each of the original video streams, each of the first type of video streams, and each of the second type of video streams include the same number of frames of images.
  • the first type of video stream corresponding to the first original video stream is a video stream obtained by taking the first frame image in the first original video stream as a starting point and, according to the first surround direction of the plurality of cameras, selecting the first number of frame images in turn from the original video stream collected by each camera.
  • the time stamps corresponding to the images in the first type of video stream are continuous.
  • the second type of video stream corresponding to the first original video stream is a video stream obtained by taking the first frame image in the first original video stream as a starting point and, according to the second surround direction of the plurality of cameras, selecting the second number of frame images in turn from the original video stream collected by each camera.
  • the time stamps corresponding to the images in the second type of video stream are continuous.
  • the first wraparound direction is opposite to the second wraparound direction.
  • the first surrounding direction is a clockwise direction
  • the second surrounding direction is a counterclockwise direction.
  • the encoder may not generate the second type of video stream or the first type of video stream. If the encoder generates the first type of video stream, the terminal may perform surround playback based on the first surround direction. If the encoder generates the second type of video stream, the terminal may perform surround playback based on the second surround direction.
  • the first numbers corresponding to different original video streams may or may not be equal.
  • the first quantity corresponding to an original video stream refers to the quantity of images selected from the original video stream each time in the process of generating the first type of video stream.
  • the second numbers corresponding to different original video streams may or may not be equal.
  • the second quantity corresponding to an original video stream refers to the quantity of images selected from the original video stream each time in the process of generating the second type of video stream.
  • each of the first quantities and each of the second quantities may or may not be equal.
  • the following description is given by taking an example that each of the first quantities and each of the second quantities are equal.
  • Both the first quantity and the second quantity may be values entered by the administrator.
  • the first quantity and the second quantity can be updated.
  • Both the first number and the second number may be an integer greater than or equal to 1.
  • the acquisition side system includes cameras 1-5 arranged in a clockwise direction
  • the original video streams corresponding to cameras 1-5 are original video streams 1-5 respectively
  • each original video stream includes 100 frames of images
  • the encoder can generate the first type video stream i and the second type video stream i based on the original video stream i.
  • 1 ≤ i ≤ 5, and i is an integer.
  • the first wrapping direction is clockwise
  • the second wrapping direction is counterclockwise
  • both the first and second numbers are 1, then:
  • the first type of video stream 1 is the video stream obtained by taking the first frame image in the original video stream 1 as the starting point and selecting 1 frame image in turn from the original video streams 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, and so on; the timestamps corresponding to the images in this video stream are consecutive.
  • Specifically, the images in the first type of video stream 1 are, in order: the 1st frame image in the original video stream 1, the 2nd frame image in the original video stream 2, the 3rd frame image in the original video stream 3, the 4th frame image in the original video stream 4, the 5th frame image in the original video stream 5, the 6th frame image in the original video stream 1, the 7th frame image in the original video stream 2, ..., and the 100th frame image in the original video stream 5.
  • the encoder can obtain the first type of video streams 2-5, as shown in FIG. 9 .
  • the second type of video stream 1 is the video stream obtained by taking the first frame image in the original video stream 1 as the starting point and selecting 1 frame image in turn from the original video streams 1, 5, 4, 3, 2, 1, 5, 4, 3, 2, and so on; the timestamps corresponding to the images in this video stream are consecutive.
  • Specifically, the images in the second type of video stream 1 are, in order: the 1st frame image in the original video stream 1, the 2nd frame image in the original video stream 5, the 3rd frame image in the original video stream 4, the 4th frame image in the original video stream 3, the 5th frame image in the original video stream 2, the 6th frame image in the original video stream 1, the 7th frame image in the original video stream 5, and so on.
  • the encoder can obtain the second type of video streams 2-5, as shown in FIG. 9 .
  • a certain number of frames of images are selected from an original video stream, and the number can be considered as a rotation interval, that is, in the process of surround playback, one video stream switches to another video stream after playing a few frames.
  • In the above example, one video stream is switched to another video stream after playing one frame of image. A minimal sketch of this interleaving is given below.
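  • The following Python sketch illustrates the frame-selection rule described above under the assumption of a fixed rotation interval; it is not the actual encoder implementation, and representing a frame as a (camera index, timestamp) pair is an assumption made for the example.

```python
# Minimal sketch of generating a "target video stream" by interleaving frames
# from the original video streams in a surround direction (illustrative only).
def generate_target_stream(num_cameras, num_frames, start_camera, clockwise=True, interval=1):
    """Select `interval` frames from each original stream in turn, starting at
    `start_camera` (0-based index), walking the cameras clockwise or counterclockwise."""
    step = 1 if clockwise else -1
    target = []
    camera = start_camera
    timestamp = 1
    while timestamp <= num_frames:
        for _ in range(interval):
            if timestamp > num_frames:
                break
            # Take the frame with this timestamp from the current camera's stream.
            target.append((camera, timestamp))
            timestamp += 1
        camera = (camera + step) % num_cameras  # move to the next camera position
    return target

# Example matching FIG. 9: cameras 1-5 (indices 0-4), 100 frames, interval of 1.
first_type_stream_1 = generate_target_stream(5, 100, start_camera=0, clockwise=True)
second_type_stream_1 = generate_target_stream(5, 100, start_camera=0, clockwise=False)
print(first_type_stream_1[:6])   # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 6)]
print(second_type_stream_1[:6])  # [(0, 1), (4, 2), (3, 3), (2, 4), (1, 5), (0, 6)]
```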
  • S104 The encoder generates an index of each video stream in the multiple original video streams and the at least one target video stream.
  • the index of a video stream includes the identifier of the camera corresponding to the video stream and the category of the video stream.
  • the camera corresponding to the original video stream refers to the camera that collects the original video stream, or the camera that collects the original video stream for obtaining the original video stream.
  • the camera corresponding to the target video stream refers to the camera corresponding to the first frame image in the target video stream.
  • that is, if the first frame image is collected by a camera, or is obtained by processing an image collected by a camera, the first frame image corresponds to that camera.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream. In this way, when the number of cameras is large, it helps to save the storage space occupied by the index of the video stream.
  • the category of the video stream can be used to characterize whether the video stream is an original video stream, a first-type video stream or a second-type video stream.
  • the index of each video stream can be as shown in Table 1:
  • Optionally, the encoder may sequentially number each video stream in the multiple original video streams and the multiple target video streams, and use the number of each video stream as the index of the video stream. For example, based on the example shown in FIG. 9, there are 5 original video streams, 5 first-type video streams and 5 second-type video streams, and the indices of these 15 video streams may be 1-15 in sequence.
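  • A possible shape for such a video-stream index is sketched below; the two variants follow the two options described above, but the concrete field names and data structure are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class VideoStreamIndex:
    camera_id: int  # camera corresponding to the stream (or to its first frame image)
    category: str   # "original", "first_type" or "second_type"

# Option 1: index each stream by camera identifier and stream category.
indices = [VideoStreamIndex(camera_id=c, category=cat)
           for cat in ("original", "first_type", "second_type")
           for c in range(1, 6)]

# Option 2: simply number all 15 video streams sequentially, 1-15.
sequential_index = {n + 1: idx for n, idx in enumerate(indices)}
print(len(indices), sequential_index[1])  # 15 VideoStreamIndex(camera_id=1, category='original')
```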
  • S105 The encoder determines the I frames and P frames in the multiple original video streams and the at least one target video stream based on the number of frames of images included in one slice, and compresses each video stream after determining the I frames and P frames.
  • the number of frames of an image included in a slice may be predefined, and may be updated after being predefined.
  • For example, the encoder may determine the 1st, 11th, 21st, 31st, ..., and 91st frame images in the video stream as I frames, and the other images as P frames.
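  • A minimal sketch of this I frame / P frame assignment, assuming a slice length of 10 frames and a 100-frame video stream (the function name is made up for the example):

```python
def assign_frame_types(num_frames, slice_length):
    """Mark the first frame of every slice as an I frame and the rest as P frames."""
    return ["I" if i % slice_length == 0 else "P" for i in range(num_frames)]

frame_types = assign_frame_types(100, 10)
# I frames fall on the 1st, 11th, 21st, ..., 91st frame images (1-based positions).
print([i + 1 for i, t in enumerate(frame_types) if t == "I"])
# [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
```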
  • S106 The encoder performs slicing processing on the compressed multiple original video streams and the compressed at least one target video stream obtained in S105, respectively, to obtain a plurality of slices, and generates an index of each slice.
  • the slices corresponding to the same timestamp include the same number of frames of images. This facilitates the alignment of corresponding video streams at the segment boundary, thereby facilitating segment download alignment.
  • the timestamp corresponding to a slice may be represented by the timestamp corresponding to the image at a specific position included in the slice, or by the position of the slice in the video stream to which it belongs.
  • Optionally, the number of frames of images included in each slice obtained in S106 is the same, so that the encoding is simple and fast.
  • the index of the slice may include the index of the slice in the compressed video stream to which it belongs.
  • For example, each video stream in FIG. 9 can be divided into 10 slices after compression; for each compressed video stream, the 1st-10th frame images constitute slice 1 of the video stream, the 11th-20th frame images constitute slice 2 of the video stream, ..., and the 91st-100th frame images constitute slice 10 of the video stream.
  • the indices of the slices into which each compressed video stream is divided are in the order of 1-10.
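  • For illustration only, the 10-frame slicing with slice indices 1-10 could be sketched as follows (the helper name and the stand-in frame list are assumptions):

```python
# Hypothetical sketch: split a compressed 100-frame stream into slices of 10 frames each.
def split_into_slices(frames, frames_per_slice=10):
    """Return {slice_index: frame_list}, with slice indices starting at 1."""
    return {
        i // frames_per_slice + 1: frames[i:i + frames_per_slice]
        for i in range(0, len(frames), frames_per_slice)
    }

frames = [f"frame_{n}" for n in range(1, 101)]      # stand-in for compressed frames
slices = split_into_slices(frames)
assert list(slices) == list(range(1, 11))           # slice indices 1-10
assert slices[1][0] == "frame_1" and slices[10][-1] == "frame_100"
```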
  • FIG. 9 illustrates that the video stream includes slices. In fact, it should be understood that the compressed video stream includes slices.
  • since each slice can be independently encoded, decoded and transmitted, the encoder can perform slicing processing on the video stream, and the video stream can be transmitted and decoded at slice granularity during video playback, which helps to save transmission resources. Slicing of the video stream by the encoder is an optional step. If the encoder does not perform slicing processing on the video stream, the encoder can set the first frame image in each original video stream and each target video stream as an I frame, which also enables the terminal to correctly decode the received video stream.
  • S107 The encoder sends the encoding result to the distribution-side system.
  • the encoding result includes each compressed video stream and index information generated during the encoding process.
  • the index information may include an index of each video stream and an index of each slice in each video stream.
  • a manner of providing users with various application services through the Internet may be used for transmission between the encoder and the distribution-side system; of course, this is not limited thereto.
  • S108 Based on the encoding result, the distribution-side system generates and stores the correspondence between "the identifier of the video content, the compressed video streams, and the index information".
  • the video streams obtained by compressing the video streams collected from different angles for the same spatial area in the same time period correspond, as a whole, to one piece of video content, and one piece of video content corresponds to one identifier (i.e., the identifier of the video content).
  • different pieces of video content correspond to video streams obtained by compressing video streams collected for the same spatial area in different time periods; or correspond to video streams obtained by compressing video streams collected for different spatial areas in the same time period (or in different time periods).
  • the correspondence stored in S108 may include: the correspondence between "the identifier of the video content, the 15 compressed video streams (i.e., original video streams 1-5, first-type video streams 1-5 and second-type video streams 1-5), the index of each of the 15 video streams, and the index of each slice of each of the 15 video streams".
  • the identifier of the video content may be a uniform resource locator (uniform resource locator, URL), which is used to represent the storage location of the encoding result corresponding to the video content in the distribution-side system.
  • the terminal may request the video content from the distribution-side system based on the identifier.
  • the URL can be used to represent the storage locations, in the distribution-side system, of "compressed original video streams 1-5, compressed first-type video streams 1-5, compressed second-type video streams 1-5, the indices of these video streams, the indices of the slices in these video streams, etc.".
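  • A hedged sketch of how such a correspondence might be laid out (the field names and the placeholder URL are assumptions, not the actual storage format of the distribution-side system):

```python
# Hypothetical sketch of the stored correspondence: the video content identifier (URL)
# maps to the 15 compressed streams plus their stream and slice indices.
video_content_record = {
    "content_id": "https://example.invalid/content/123",     # placeholder URL
    "streams": {
        stream_index: {
            "data": f"compressed_stream_{stream_index}.bin",  # placeholder storage name
            "slice_indices": list(range(1, 11)),              # slices 1-10 per stream
        }
        for stream_index in range(1, 16)                      # stream indices 1-15
    },
}
```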
  • in the video encoding method provided by the embodiment of the present application, according to the set direction, a certain number of frame images are selected from the original video stream corresponding to each camera to obtain the target video stream, and the original video streams and the target video streams are compressed respectively.
  • when the encoder compresses the target video stream, this is equivalent to compressing an image in one original video stream with reference to an image in another original video stream.
  • the video clips provided by the distribution-side system to the terminal can be directly selected from the target video stream.
  • an image in another original video stream can be obtained by decoding based on an image in one original video stream. Therefore, compared with the conventional technology, the video stream transmitted during video playback does not need to include many I-frames, thereby helping to reduce the transmission bit rate of the video stream.
  • the video coding method provided by the embodiment of the present application does not need to combine multiple video streams into one high-resolution picture, so the number of cameras and picture resolution are not limited. That is to say, after using the technical solution for video encoding, in the video playback stage, the transmission bit rate of the video stream can be reduced under the condition of ensuring the picture resolution.
  • the encoder may not perform fragmentation on the compressed multiple original video streams and the compressed at least one target video stream. Based on this, the above S105-S106 can be replaced with the following steps 1-2:
  • Step 1: The encoder determines the first frame image in each original video stream and each target video stream as an I frame, determines the other images as I frames or P frames (for example, determines the other images as P frames), and compresses each video stream after the I frames and P frames are determined.
  • Step 2: The encoder separately encapsulates each compressed original video stream and each compressed target video stream to obtain multiple segments, and generates an index of each segment.
  • the index information in S107 and S108 may include: the index of each video stream and the index of each segment in each video stream.
  • the encoder may not perform fragmentation processing on the compressed video stream, but directly encapsulate it. This is a technical solution provided considering that segments can be transmitted independently.
  • the encoder may encapsulate images in the compressed multiple original video streams and the compressed at least one target video stream obtained in S105. Based on this, S106 may include: the encoder performs fragmentation processing on the multiple original video streams and at least one target video stream after performing encapsulation.
  • the terminal can obtain the video segments to be played from the distribution-side system at segment granularity, instead of obtaining the to-be-played video clips from the distribution-side system at slice granularity or video stream granularity, thereby helping to save transmission resources.
  • the number of frames of an image included in a segment may be predefined, and may be updated after being predefined.
  • a slice can consist of one or more segments.
  • in order that the surround playback can be started at any position, the encoder can use the segmentation method to encapsulate each frame image in each of the compressed multiple original video streams and the compressed at least one target video stream into a separate segment.
  • FIG. 10 illustrates an example of encapsulating the images belonging to one slice based on the fMP4 chunk method. Specifically, each frame image in the slice is encapsulated into an independent mdat box, and the slice is encapsulated in a multi-moof header manner, where the moof header includes styp, sidx and moof. In this way, each frame image in the video stream can be transmitted independently as a segment.
  • the first image in the slice is an I frame, and the other images are P frames.
  • the index information in S107 may also include the index of each segment.
  • the embodiment of the present application does not limit the specific implementation manner of the segmented index.
  • the index of the segment may include the index of the segment in the shard to which it belongs. Taking a shard including 5 segments as an example, the indices of the segments in each shard may be 1-5.
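  • As an illustrative sketch only (the one-frame-per-segment assumption and the helper name are hypothetical), a segment index within its slice could be computed as:

```python
# Hypothetical sketch: with one frame per segment and 5 segments per slice,
# a frame's segment index within its slice is its position inside that slice.
def segment_index_in_slice(frame_number, segments_per_slice=5):
    return (frame_number - 1) % segments_per_slice + 1

assert [segment_index_in_slice(n) for n in range(1, 11)] == [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
```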
  • the encoder can perform both slicing processing and encapsulation on the compressed video stream. Since the first frame image of each slice is an I frame, while in a video stream without slicing processing usually only the first frame image is an I frame, compared with the technical solution of "no slicing processing, only encapsulation", this scheme can, in the video playback stage, reduce the transmission of redundant images to a certain extent.
  • FIG. 11 is a schematic flowchart of a video playing method according to an embodiment of the present application.
  • This embodiment is described by taking the method applied to the video system shown in FIG. 5C as an example.
  • the terminal in this embodiment may be the terminal 30 or the terminal 31 in FIG. 5C .
  • the distribution-side system in this embodiment may be the distribution-side system 40 in FIG. 5C.
  • the method shown in FIG. 11 may include the following S201-S210:
  • S201 The terminal acquires the identifier of the video content to be played.
  • the video content to be played refers to the video content that the user desires to play, such as a football game or a concert.
  • the video content to be played in S201 may be the video content in the above-mentioned S108.
  • the video content includes: 15 compressed video streams (that is, the original video streams 1-5, the first-type video streams 1-5 and the second-type video streams 1-5), the index of each of the 15 video streams, and the index of each slice of each of the 15 video streams.
  • the embodiment of the present application does not limit the specific form of the identifier of the video content to be played.
  • the identifier of the video content to be played may be a URL of the video content to be played.
  • This embodiment of the present application does not specifically limit the manner of acquiring the identifier of the video content to be played.
  • the user can select the video content to be played by clicking a video link on the terminal screen, and the terminal receives the user's operation instruction and obtains the URL of the to-be-played video content based on the operation instruction.
  • the distribution-side system may actively push URLs of one or more video contents to the terminal, where the one or more video contents include URLs of the video contents to be played.
  • S202 Based on the identifier of the video content to be played, the terminal obtains index information corresponding to the video content to be played from the distribution-side system.
  • the index information may include an index of a video stream corresponding to the video content to be played and an index of a segment in the video stream (eg, an index of each video stream and an index of each segment in each video stream).
  • the terminal may send a request to the distribution-side system, where the request carries the identifier of the video content to be played, and is used to request index information corresponding to the video content to be played.
  • the distribution-side system acquires and feeds back index information corresponding to the to-be-played video content to the terminal based on the identifier of the to-be-played video content.
  • the distribution-side system can also actively push the index information corresponding to the video content to be played to the terminal.
  • S203 The terminal requests an initial video stream of the video content to be played from the distribution side system, decompresses the requested initial video stream, and plays the decompressed initial video stream.
  • the initial video stream can be predefined.
  • the initial video stream may be an original video stream corresponding to the to-be-played video content, which may be specified by the administrator or determined by the distribution-side system according to the size of the index.
  • S203 may include: taking the transmission of video streams between the terminal and the distribution-side system at slice granularity as an example, the terminal requests, from the distribution-side system, a part of the slices in the initial video stream of the video content to be played, decompresses the requested slices, and plays the decompressed slices.
  • for example, the terminal sequentially requests the first slice, the second slice, ... in the initial video stream of the video content to be played, until the terminal determines that the surround playback needs to be started, for example, until the terminal receives the first operation in S204. That is, in this example, before the terminal receives the first operation in S204, the initial video stream is played in sequence (i.e., the normal playing stage). The terminal receiving the first operation in S204 indicates that the terminal needs to start performing surround playback of the video content. For example, assuming that the initial video stream is original video stream 1, the terminal can sequentially request slice 1, slice 2, slice 3, ... in the compressed original video stream 1 from the distribution-side system, decompress the requested slices and play the decompressed slices, until the terminal executes S204 and stops acquiring the slices in the original video stream 1.
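  • The sketch below only illustrates this request-decode-play loop of the normal playing stage; the callbacks request_slice, surround_requested and decode_and_play are hypothetical names, not part of the embodiment:

```python
# Hypothetical sketch of the normal playing stage: request slices of the initial
# stream one by one until a surround-playback operation (the first operation) arrives.
def play_initial_stream(request_slice, surround_requested, decode_and_play):
    slice_index = 1
    while not surround_requested():
        compressed_slice = request_slice("original_1", slice_index)  # assumed initial stream
        decode_and_play(compressed_slice)
        slice_index += 1
```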
  • S203 is an optional step. If the technical solution of S203 is not performed, it can be understood that the terminal directly performs surround playback of the video stream corresponding to the video content to be played. Executing the technical solution of S203 can be understood as: the terminal first plays the video stream corresponding to the video content to be played normally, and then performs surround playback of the video stream corresponding to the video content to be played.
  • S204 The terminal receives the first operation, and in response to the first operation, the terminal acquires the wrapping direction of the video clip to be played, the time stamp corresponding to the start image of the video clip to be played, and the time stamp corresponding to the end image of the video clip to be played.
  • the terminal starts to perform surround playback under the instruction of the user.
  • the embodiment of the present application is not limited to this.
  • the terminal may start to perform surround playback when a preset image in a certain video stream is played.
  • the first operation may be a touch operation, a pressing operation, or a voice operation, or a combination of at least two operations described above.
  • taking the terminal being a terminal with a touch screen, such as a smart phone or a tablet computer, as an example, the terminal receives a user's touch operation on a rotation control on the touch screen. In response to the touch operation, the terminal determines the wrapping direction of the video clip to be played, the timestamp corresponding to the start image of the video clip to be played, and the timestamp corresponding to the end image of the video clip to be played.
  • when the touch operation is to slide the rotation control in the first direction (e.g., to the left), it indicates that the wrapping direction of the video clip to be played is clockwise.
  • when the touch operation is to slide the rotation control in the second direction (e.g., to the right), it indicates that the wrapping direction of the video clip to be played is counterclockwise. That is, touch operations in different directions on the same rotation control indicate different wrapping directions.
  • when the touch operation is a touch operation on the first rotation control, it indicates that the wrapping direction of the video clip to be played is clockwise.
  • when the touch operation is a touch operation on the second rotation control, it indicates that the wrapping direction of the video clip to be played is counterclockwise. That is, touch operations on different rotation controls indicate different wrapping directions.
  • the terminal determines the timestamp corresponding to the start image of the video clip to be played through the start touch time of the touch operation. For example, without considering the delay, the terminal may use the timestamp corresponding to the next frame image of the currently playing image when the touch operation is received as the timestamp corresponding to the start image of the video clip to be played. For another example, when the delay is considered, the terminal may use the timestamp corresponding to the Nth frame image after the currently playing image when the touch operation is received as the timestamp corresponding to the start image of the video clip to be played, wherein N is an integer greater than 1 and is a predefined value.
  • the terminal determines the duration of the video clip to be played through the touch duration of the touch operation. Subsequently, the terminal may determine the time stamp corresponding to the end image of the to-be-played video clip based on the time-stamp corresponding to the start image of the to-be-played video clip and the duration of the to-be-played video clip.
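  • For illustration, assuming a playback frame rate and a predefined N (both values are assumptions made only for this sketch), the start and end timestamps could be derived as follows:

```python
# Hypothetical sketch: derive the start/end timestamps of the to-be-played clip
# from the touch operation; FRAME_DELAY_N and FPS are assumed values.
FRAME_DELAY_N = 3   # assumed predefined N (frames of delay compensation)
FPS = 25            # assumed playback frame rate

def clip_boundaries(current_timestamp, touch_duration_s, consider_delay=True):
    """Return (start_timestamp, end_timestamp) in frame-number units."""
    offset = FRAME_DELAY_N if consider_delay else 1
    start_ts = current_timestamp + offset
    clip_frames = max(1, round(touch_duration_s * FPS))   # duration -> number of frames
    end_ts = start_ts + clip_frames - 1
    return start_ts, end_ts

# e.g. touch received while frame 4 is playing, no delay compensation, 0.36 s swipe:
print(clip_boundaries(4, 0.36, consider_delay=False))     # (5, 13)
```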
  • the terminal receives the user's pressing operation on the rotary button.
  • the terminal determines the wrapping direction of the video clip to be played, the time stamp corresponding to the start image and the time stamp corresponding to the end image of the video clip to be played.
  • the terminal may determine the time stamp corresponding to the start image of the video clip to be played through the start pressing time of the pressing operation.
  • the terminal may determine the duration of the to-be-played video clip according to the pressing duration or the number of pressings of the pressing operation.
  • S205 The terminal determines, from the at least one target video stream, the index of the video stream to which the video clip to be played belongs (i.e., the first target video stream), based on the wrapping direction of the video clip to be played and the timestamp corresponding to the start image.
  • the wrapping direction corresponding to the first target video stream is the same as the wrapping direction of the video clip to be played.
  • the first target video stream includes an image of the previous frame of the first image in the currently playing video stream, and the timestamp corresponding to the first image is the same as the timestamp corresponding to the start image of the video clip to be played.
  • for example, the terminal starts playing from the first frame image of the original video stream 1, and receives the first operation when it plays to the fourth frame image (corresponding to timestamp 4).
  • if the terminal determines that the wrapping direction of the video clip to be played is the clockwise direction, the terminal may determine, based on the wrapping direction being the clockwise direction, that the category of the first target video stream is the first-type video stream.
  • the terminal determines that the timestamp corresponding to the start image of the video clip to be played is timestamp 5 . Based on this, it can be known that the first image is the fifth frame image in the original video stream 1 .
  • the first target video stream is the first type of video stream including the fourth frame image in the original video stream 1 , that is, the first type of video stream 3 .
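  • A simplified, hypothetical sketch of this selection logic (the stream metadata layout and names such as first_type_3 are assumptions used only to mirror the example):

```python
# Hypothetical sketch: pick the first target video stream from its wrapping direction
# and the start timestamp; each target stream records which (stream, frame) images it contains.
target_streams = [
    {"name": "first_type_3", "category": "first_type",
     "frames": [("original_1", 4), ("first_type_3", 5)]},     # simplified content
    {"name": "second_type_2", "category": "second_type",
     "frames": [("original_1", 4), ("second_type_2", 5)]},
]

def select_target_stream(streams, wrap_direction, current_stream, start_timestamp):
    wanted_category = "first_type" if wrap_direction == "clockwise" else "second_type"
    previous_frame = (current_stream, start_timestamp - 1)    # frame before the start image
    for stream in streams:
        if stream["category"] == wanted_category and previous_frame in stream["frames"]:
            return stream["name"]
    return None

assert select_target_stream(target_streams, "clockwise", "original_1", 5) == "first_type_3"
```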
  • S206 The terminal determines, based on the timestamp corresponding to the start image and the index of the first target video stream, the index of the first slice to which the start image belongs; and determines, based on the timestamp corresponding to the end image and the index of the first target video stream, the index of the second slice to which the end image belongs.
  • the terminal determines the index of the starting image based on the timestamp corresponding to the starting image and the index of the first target video stream; and determines the termination based on the timestamp corresponding to the ending image and the index of the first target video stream The index of the image. And, based on the index of the starting image, the index of the first slice to which the starting image belongs is determined; and based on the index of the ending image, the index of the second slice to which the ending image belongs is determined.
  • the terminal may directly determine the index of the first slice to which the starting image belongs based on the timestamp corresponding to the starting image and the index of the first target video stream.
  • the terminal may directly determine the index of the second slice to which the end image belongs based on the timestamp corresponding to the end image and the index of the first target video stream.
  • for example, if the start image is the fifth frame image in the first-type video stream 3, and the end image is the 13th frame image in the first-type video stream 3, then the first slice and the second slice are slice 1 and slice 2 in the first-type video stream 3, respectively.
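  • Under the example's assumption of 10 frames per slice and slice indices starting at 1, this slice lookup could be sketched as:

```python
# Hypothetical sketch: map the start/end image timestamps to slice indices.
def slice_index_for(timestamp, frames_per_slice=10):
    return (timestamp - 1) // frames_per_slice + 1

start_ts, end_ts = 5, 13
first_slice = slice_index_for(start_ts)    # frame 5  -> slice 1
second_slice = slice_index_for(end_ts)     # frame 13 -> slice 2
assert (first_slice, second_slice) == (1, 2)
```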
  • S207 The terminal sends a request message to the distribution-side system, where the request message includes the index of the first target video stream, the index of the first segment, and the index of the second segment.
  • S208 The distribution-side system searches for the video clip to be played based on the request message.
  • the distribution-side system searches for the first target video stream from multiple original video streams and at least one target video stream based on the index of the first target video stream. Then, looking up the first slice from the first target video stream based on the index of the first slice; and looking up the second slice from the first target video stream based on the index of the second slice.
  • S209 The distribution-side system sends the found video clip to be played to the terminal.
  • the distribution-side system sends the first fragment and the second fragment to the terminal.
  • S210 The terminal decompresses the video clip to be played, and plays the decompressed video clip to be played.
  • subsequently, the terminal may take the next frame image, in the original video stream where the last frame image of the surround playback stage is located, as a starting point, and continue to request and play the images after that starting point in that original video stream.
  • the method may further include: the terminal splicing the to-be-played video clips after the played video clips according to the time stamps corresponding to the received images in the to-be-played video clips.
  • the terminal decompresses the video clip to be played based on the spliced video clip.
  • FIG. 12 is a schematic diagram of the video streams transmitted between the distribution-side system and the terminal and the video stream played by the terminal, provided based on the example in S206.
  • the video streams transmitted between the distribution-side system and the terminal are, in order: slice 1 in the compressed original video stream 1, slices 1-2 in the compressed first-type video stream 3, and slices 2, 3, ... in the compressed original video stream 4.
  • after receiving slice 1 in the compressed original video stream 1, the terminal decompresses the 1st to 4th frame images in the compressed original video stream 1.
  • after receiving slice 1 in the compressed first-type video stream 3, the terminal splices the 5th to 10th frame images in the compressed first-type video stream 3 after the 1st to 4th frame images in the original video stream 1, and then decompresses the 5th to 10th frame images in the compressed first-type video stream 3. Since, during encoding, the fifth frame image in the first-type video stream 3 is compressed based on the fourth frame image in the original video stream 1, during decoding, the fifth frame image in the compressed first-type video stream 3 can be decompressed based on the fourth frame image in the original video stream 1.
  • after receiving slice 2 in the compressed first-type video stream 3, the terminal splices the 11th to 13th frame images in the compressed first-type video stream 3 after the 5th to 10th frame images in the first-type video stream 3, and then decompresses the 11th to 13th frame images in the compressed first-type video stream 3. Since the 11th frame image in the first-type video stream 3 is an I frame and can be decoded independently, the 11th to 13th frame images in the compressed first-type video stream 3 can also be decompressed directly without splicing.
  • after receiving slice 2 in the compressed original video stream 4, the terminal splices the 14th to 20th frame images in the compressed original video stream 4 after the 11th to 13th frame images in the first-type video stream 3, and then decompresses the 14th to 20th frame images in the compressed original video stream 4. During encoding, the 14th frame image in the original video stream 4 is encoded based on the 13th frame image in the original video stream 4, and the 13th frame image in the original video stream 4 is the same as the 13th frame image in the first-type video stream 3; that is to say, the 14th frame image in the original video stream 4 is encoded based on the 13th frame image in the first-type video stream 3. Therefore, during decoding, the 14th frame image in the compressed original video stream 4 can be decompressed based on the 13th frame image in the first-type video stream 3.
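  • A hedged sketch of the splice-then-decode flow of FIG. 12 (decoding is only modelled by recording which reference frame each frame depends on; all names and the per-slice frame lists are assumptions mirroring the example):

```python
# Hypothetical sketch of the terminal-side splicing in FIG. 12.
received = [
    ("original_1", "slice_1", list(range(1, 5))),       # only frames 1-4 are played
    ("first_type_3", "slice_1", list(range(5, 11))),    # frames 5-10
    ("first_type_3", "slice_2", list(range(11, 14))),   # frames 11-13
    ("original_4", "slice_2", list(range(14, 21))),     # frames 14-20
]

timeline = []   # (stream, frame_number) in play order
refs = {}       # frame -> the frame it is decoded against (None for I frames)
for stream, _slice, frames in received:
    for frame in frames:
        key = (stream, frame)
        # frames 1, 11, 21, ... are I frames; P frames reference the
        # immediately preceding frame in the spliced timeline
        refs[key] = None if frame % 10 == 1 else (timeline[-1] if timeline else None)
        timeline.append(key)

# frame 5 of first-type stream 3 is decoded against frame 4 of original stream 1,
# frame 14 of original stream 4 against frame 13 of first-type stream 3
assert refs[("first_type_3", 5)] == ("original_1", 4)
assert refs[("original_4", 14)] == ("first_type_3", 13)
```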
  • in the video playback method provided by the embodiment of the present application, the distribution-side system selects, at the request of the terminal, the video clip currently required by the terminal from the at least one target video stream, where the target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction. Since the video clip to be played by the terminal is selected from the first target video stream, the terminal does not need to acquire images compressed based on different original video streams, which helps to reduce the transmission bit rate of the video stream.
  • when the encoder compresses the target video stream, this is equivalent to compressing an image in one original video stream with reference to an image in another original video stream.
  • the video clips provided by the distribution-side system to the terminal are directly selected from the target video stream. Therefore, the terminal can decode an image in one original video stream to obtain an image in another original video stream. Therefore, compared with the conventional technology, the video stream transmitted during video playback does not need to include many I-frames, thereby helping to reduce the transmission bit rate of the video stream. Specific examples thereof are the examples shown in FIGS. 9 and 12 .
  • since the above video encoding method does not need to combine multiple video streams into one high-resolution picture, the number of cameras and the picture resolution are not limited. That is to say, after video encoding is performed by using the above video encoding method, in the video playback stage, the transmission bit rate of the video stream can be reduced while the picture resolution is guaranteed.
  • the determination of the index of the first target video stream and of the indices of the slices to which the start image and the end image of the video clip to be played belong is performed on the terminal side, and the distribution-side system does not perceive it. Therefore, the modification that this technical solution requires of the distribution-side system is small, so it can be applied to a conventional distribution-side system.
  • the above S204-S207 may also be performed by the distribution-side system.
  • for example, the terminal sends information related to the first operation (such as the rotation control targeted by the first operation, the operation duration of the first operation, etc.) to the distribution-side system, and the distribution-side system executes the above process based on the information related to the first operation.
  • the terminal does not need to acquire various index information of the video stream. In this way, the processing pressure of the terminal is relieved.
  • optionally, if S106 is replaced with: the encoder performs slicing processing and encapsulation on the compressed multiple original video streams and the compressed at least one target video stream, then:
  • the index information in S202 may also include an index of a segment in the slice, such as an index of each segment in each slice.
  • S206 may further include: the terminal determining the index of the segment to which the starting image belongs in the first slice, and determining the index of the segment to which the termination image belongs in the second slice.
  • for example, the segment to which the start image of the video clip to be played belongs and the segment to which the end image belongs are the 5th segment and the 17th segment in the first-type video stream 3, respectively.
  • the request message in S207 may further include: an index in the first slice of the segment to which the start image belongs, and an index in the second slice of the segment to which the end image belongs.
  • S208 may further include: the distribution-side system searches for the segment to which the start image belongs from the first segment based on the index of the segment to which the start image belongs, and retrieves the segment from the second segment based on the index of the segment to which the end image belongs. Find the segment to which the termination image belongs.
  • S209 may include: the distribution-side system sends, to the terminal, the segment in the first slice to which the start image belongs, the segment in the second slice to which the end image belongs, and the other segments between these two segments.
  • FIG. 13 is a schematic diagram of the video streams transmitted between the distribution-side system and the terminal and the video stream played by the terminal, provided based on the example in S205 and in combination with segmentation.
  • the video streams transmitted between the distribution-side system and the terminal are, in sequence: the 1st to 4th frame images in the compressed original video stream 1, the 5th to 17th frame images in the compressed first-type video stream 3, and the 18th, 19th, ... frame images in the compressed original video stream 4.
  • compared with FIG. 12, fewer images are transmitted between the distribution-side system and the terminal. That is, encapsulating the images in one slice into multiple segments and transmitting the video streams at segment granularity helps to reduce the transmission of redundant images, thereby saving transmission resources.
  • when segment-granularity transmission is used, the transmitted images are consistent with the images played by the terminal, and the terminal plays, in sequence, the 1st to 4th frame images in the original video stream 1, the 5th to 17th frame images in the first-type video stream 3, and the 18th, 19th, ... frame images in the original video stream 4.
  • optionally, if in the video encoding stage the encoder does not perform slicing processing on the compressed video stream but directly encapsulates it, then:
  • the index information in S202 may include: an index of a video stream corresponding to the to-be-played video content and an index of a segment in the video stream (eg, an index of each video stream and an index of each segment in each video stream).
  • S206 can be replaced with: the terminal determines the index of the first segment to which the starting image belongs based on the timestamp corresponding to the starting image and the index of the first target video stream; and based on the timestamp corresponding to the ending image and the first target video stream The index of the target video stream, which determines the index of the second segment to which the termination image belongs.
  • the request message in S207 may include: the index of the first target video stream, the index of the first segment, and the index of the second segment.
  • S209 specifically includes: the distribution-side system sends the first segment, the second segment, and the segment between the two segments to the terminal.
  • the video encoding device (such as an encoder) and the video decoding device (such as a CDN system or a terminal) may be divided into functional modules according to the foregoing method examples.
  • for example, each functional module may be obtained through division according to a corresponding function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that, the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • FIG. 14 shows a schematic structural diagram of a video encoding apparatus 140 provided by an embodiment of the present application.
  • the video encoding apparatus 140 can be used to perform any of the video encoding methods provided above, for example, to perform the steps performed by the encoder in the video encoding method shown in FIG. 8 .
  • the video encoding apparatus 140 may include: an acquisition unit 1401 , a generation unit 1402 and a compression unit 1403 .
  • the acquiring unit 1401 is configured to acquire multiple original video streams, wherein the multiple original video streams are obtained based on the video streams collected by multiple cameras for the same spatial region in the same time period.
  • the generating unit 1402 is configured to generate at least one target video stream according to the multiple original video streams, wherein the target video stream is a video stream obtained by selecting a certain number of frame images from the original video streams corresponding to each camera according to the set direction .
  • a compression unit 1403, configured to compress the at least one target video stream.
  • the acquiring unit 1401 may be configured to perform the receiving action corresponding to S102
  • the generating unit 1402 may be configured to perform S103
  • the compressing unit 1403 may be configured to perform S105 .
  • the time stamps corresponding to the images in the target video stream are consecutive.
  • the multiple original video streams include a first original video stream, and the target video stream takes a first frame image in the first original video stream as a starting point.
  • the video encoding apparatus 140 further includes: a sending unit 1404 and an encapsulating unit 1405 .
  • the generating unit 1402 is further configured to generate an index of the at least one target video stream.
  • the index of the target video stream includes: the identifier of the camera corresponding to the target video stream and the category of the target video stream. The camera corresponding to the target video stream is the camera corresponding to the first frame image in the target video stream, and the category of the video stream is used to represent whether the video stream is an original video stream or a target video stream.
  • the sending unit 1404 is configured to send the index of the at least one target video stream.
  • the generating unit 1402 can be used to execute S106, and the sending unit 1404 can be used to execute S107.
  • the set direction is a surrounding direction of the plurality of cameras, and the surrounding direction includes a clockwise direction or a counterclockwise direction.
  • the encapsulation unit 1405 is configured to encapsulate the compressed at least one target video stream to obtain multiple segments.
  • the generating unit 1402 is further configured to generate indexes of the plurality of segments.
  • the sending unit 1404 is configured to send the indices of the multiple segments.
  • some or all of the functions implemented by the acquiring unit 1401, the generating unit 1402, the compression unit 1403, and the encapsulating unit 1405 in the video encoding apparatus 140 may be implemented by the processor 701 in FIG. 7 executing the program code in the memory 702 in FIG. 7.
  • the sending unit 1404 may be implemented by the sending unit in the communication interface 703 in FIG. 7 .
  • FIG. 15 shows a schematic structural diagram of a video playback apparatus 150 provided by an embodiment of the present application.
  • the video playback apparatus 150 can be used to perform any of the video playback methods provided above, for example, to perform the steps performed by the terminal in the video playback method shown in FIG. 11 .
  • the video playback device 150 may include: a receiving unit 1501, a decompressing unit 1502, and a playing unit 1503.
  • the receiving unit 1501 is configured to receive a video clip to be played, and the video clip to be played is selected from a first target video stream.
  • the first target video stream is a video stream obtained by selecting a certain number of frame images from original video streams corresponding to each of the plurality of cameras according to a set direction.
  • the multiple original video streams corresponding to the multiple cameras are obtained based on the video streams collected by the multiple cameras for the same spatial area in the same time period.
  • the decompression unit 1502 is configured to decompress the video segment to be played.
  • the playing unit 1503 is used to play the decompressed video segment. For example, referring to FIG. 11,
  • the receiving unit 1501 can be used to perform the receiving action corresponding to S209
  • the decompression unit 1502 can be used to perform the decompression step in S210
  • the playing unit 1503 can be used to perform the playing step in S210 .
  • the time stamps corresponding to the images in the first target video stream are consecutive.
  • the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point.
  • the video playback apparatus 150 further includes: a sending unit 1504 and a determining unit 1505 .
  • the sending unit 1504 is configured to send a request message, where the request message includes an index of the first target video stream.
  • the index of the first target video stream is used to determine the first target video stream.
  • the index of the first target video stream includes an identifier of a camera corresponding to the first target video stream and a category of the first target video stream.
  • the camera corresponding to the first target video stream is the camera corresponding to the first frame image in the first target video stream.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream.
  • the sending unit 1504 can be used to perform S207.
  • the request message further includes the index of the target segment to which the video segment to be played belongs.
  • the receiving unit 1501 is specifically configured to: receive the target segment.
  • the determining unit 1505 is configured to determine the wraparound direction of the video clip to be played and the timestamp corresponding to the start image of the video clip to be played; and, based on the wraparound direction of the video clip to be played and the start of the video clip to be played The timestamp corresponding to the image determines the index of the first target video stream.
  • the wrapping direction corresponding to the first target video stream is the same as the wrapping direction of the video clip to be played.
  • the first target video stream includes the image of the previous frame of the first image in the currently playing video stream, and the timestamp corresponding to the first image is the same as the timestamp corresponding to the starting image.
  • some or all of the functions implemented in the decompression unit 1502 and the determination unit 1505 in the video playback device 150 may be implemented by the processor 701 in FIG. 7 executing the program code in the memory 702 in FIG. 7 .
  • the receiving unit 1501 may be implemented by the receiving unit in the communication interface 703 in FIG. 7 .
  • the sending unit 1504 can be implemented by the sending unit in the communication interface 703 in FIG. 7 .
  • the playback unit 1503 can be implemented by a display screen, an audio input and output device, etc. (not shown in FIG. 7 ).
  • FIG. 16 shows a schematic structural diagram of a video playback apparatus 160 provided by an embodiment of the present application.
  • the video playback apparatus 160 can be used to perform any of the video playback methods provided above, for example, to perform the steps performed by the network device in the video playback method shown in FIG. 11 .
  • the video playback apparatus 160 may include: a determining unit 1601 and a sending unit 1602 .
  • the determining unit 1601 is configured to determine the video stream segment to be played. Wherein, the video clip to be played is selected from the first target video stream, and the first target video stream is a video stream obtained by selecting a certain number of frame images from the original video streams corresponding to each camera in the plurality of cameras according to the set direction. The multiple original video streams corresponding to the multiple cameras are obtained based on the video streams collected by the multiple cameras for the same spatial area in the same time period.
  • the sending unit 1602 is configured to send the to-be-played video stream segment to the terminal. For example, with reference to FIG. 11 , the determining unit 1601 can be used to perform S208, and the sending unit 1602 can be used to perform S209.
  • the time stamps corresponding to the images in the first target video stream are consecutive.
  • the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point.
  • the video playback apparatus 160 may further include: a receiving unit 1603 .
  • the receiving unit 1603 is configured to receive a request message sent by the terminal, where the request message is used to request a video clip to be played.
  • the request message includes an index of the first target video stream
  • the index of the first target video stream includes an identifier of a camera corresponding to the first target video stream and a category of the first target video stream.
  • the camera corresponding to the first target video stream is the camera corresponding to the first frame image in the first target video stream.
  • the category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream.
  • the determining unit 1601 is further configured to determine the first target video stream based on the index of the first target video stream.
  • the sending unit 1602 is further configured to send the index of at least one target video stream to the terminal.
  • the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
  • the request message further includes an index of the target segment to which the video segment to be played belongs.
  • the determining unit 1601 is further configured to, based on the index of the target segment, determine the target segment from the first target video stream.
  • the sending unit 1602 is specifically configured to send the target segment to the terminal.
  • the sending unit 1602 is further configured to send the index of the segment in the at least one target video stream to the terminal.
  • the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
  • part or all of the functions implemented by the determining unit 1601 in the video playback device 160 may be implemented by the processor 701 in FIG. 7 executing the program codes in the memory 702 in FIG. 7 .
  • the sending unit 1602 may be implemented by the sending unit in the communication interface 703 in FIG. 7 .
  • the receiving unit 1603 may be implemented by the receiving unit in the communication interface 703 in FIG. 7 .
  • the embodiment of the present application also provides a video system, including a network device and a terminal.
  • the network device is used for: acquiring multiple original video streams, which are obtained based on video streams collected by multiple cameras for the same spatial area in the same time period; generating at least one target video stream according to the multiple original video streams, where the target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction; and compressing the at least one target video stream.
  • the terminal is used for: receiving the video clip to be played from the network device, where the video clip to be played is selected from a first target video stream in the compressed at least one target video stream; decompressing the video clip to be played; and playing the decompressed video clip.
  • the network device here may be the encoder described above. Reference may be made to the above with regard to other functions performed by the encoder, for example with reference to the embodiment shown in Figure 8 above.
  • the terminal may be the terminal described above. For other functions performed by the terminal, reference may be made to the above, for example, the embodiment shown in FIG. 11 above.
  • the network device may also be used to perform the steps performed by the distribution-side system in the embodiment shown in FIG. 11 above.
  • the video system may further include a distribution-side system, where the distribution-side system is used for sending the encoding result with the encoder, and distributing the to-be-played video segment to the terminal.
  • Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program runs on a computer, the computer is made to execute any one of the above-mentioned video encoding methods or video playback methods.
  • the embodiment of the present application also provides a chip.
  • the chip integrates a control circuit and one or more ports for realizing the functions of the video encoding device 140 , the video playing device 150 or the video playing device 160 .
  • for the functions supported by the chip, reference may be made to the above, which will not be repeated here.
  • the described program can be stored in a computer-readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a random access memory, or the like.
  • the above-mentioned processing unit or processor may be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • the embodiments of the present application also provide a computer program product containing instructions, when the instructions are run on a computer, the instructions cause the computer to execute any one of the methods in the foregoing embodiments.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) manner.
  • the computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, that integrates one or more usable media.
  • the usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state drive (SSD)), and the like.
  • the above-mentioned devices for storing computer instructions or computer programs provided in the embodiments of the present application are all non-transitory (non-transitory) .

Abstract

The present application relates to the technical field of video processing, and disclosed therein are a video encoding and video playback method, apparatus and system, which are applied to a surround playback scenario. The method comprises: acquiring a plurality of raw video streams obtained on the basis of video streams collected by a plurality of cameras in the same time period for the same spatial area, and generating at least one target video stream according to the plurality of raw video streams, the target video stream being a video stream obtained by selecting a specific number of image frames from among the raw video streams corresponding to each camera according to a set direction; and compressing the at least one target video stream. On the basis of the foregoing, in a video playback stage, the present application is beneficial for reducing the transmission code rate of a video stream.

Description

Video encoding and video playback method, apparatus and system
This application claims the priority of the Chinese patent application with application number 202011282060.9 and application title "Video coding and video playback method, device and system", which was submitted to the State Intellectual Property Office on November 16, 2020, the entire contents of which are incorporated by reference in this application.
Technical Field
The present application relates to the technical field of video processing, and in particular, to a video encoding and video playback method, apparatus and system.
Background
In the 5G scenario, audiences pursue a better video viewing experience and hope to see more details in the video, which leads to the demand for 360° viewing around a target object. Especially in sports competitions, concerts and other scenes with a specific focus, audiences want to watch dynamic surround playback images at different times. Surround playback refers to playback of images collected at different angles for the same spatial area in a certain surround direction. For the terminal, displaying the picture of the surround playback to the user is to show the user the images collected by the cameras of each camera position according to the surround direction.
In the prior art, the video streams collected by cameras at different positions are independently compressed, so images collected by cameras at other positions cannot be referred to during decompression. In the surround playback scenario, since the continuous multi-frame images played by the terminal come from cameras at different positions, decompressing a certain frame image requires relying on other images of the corresponding original video stream, which leads to a higher transmission bit rate of the video stream.
SUMMARY OF THE INVENTION
Embodiments of the present application provide a video encoding and video playback method, device, and system, which help to reduce the transmission bit rate of a video stream, and can be applied to a surround playback scenario.
In order to achieve the above purpose, the application provides the following technical solutions:
In a first aspect, the present application provides a video encoding method. The method includes: first, acquiring multiple original video streams obtained based on video streams collected by multiple cameras for the same spatial area in the same time period. Second, at least one target video stream is generated according to the multiple original video streams. The target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction. Next, the at least one target video stream is compressed. In this technical solution, in the encoding stage, a certain number of frame images are selected from the original video stream corresponding to each camera according to the set direction to generate the target video stream, and then the target video stream is compressed. Based on this, in the surround playback stage, video clips selected from the compressed target video stream can be directly transmitted to the terminal, so that the terminal can decompress and obtain the video clips to be played without relying on other images of each original video stream for decoding, which helps to reduce the transmission bit rate of the video stream.
在一种可能的设计中,目标视频流中的图像对应的时间戳连续。这样,有助于终端实现在该连续的多个时间戳内实时环绕播放,增加视频画面的流畅度。In a possible design, the time stamps corresponding to the images in the target video stream are consecutive. In this way, it is helpful for the terminal to realize real-time surround playback within the consecutive multiple time stamps, thereby increasing the smoothness of the video picture.
在一种可能的设计中,该多个原始视频流包括第一原始视频流,目标视频流以第一原始视频流中的第一帧图像为起点。这样,有助于终端实现从原始视频流中的第一帧图像开始环绕播放。In a possible design, the multiple original video streams include a first original video stream, and the target video stream takes a first frame image in the first original video stream as a starting point. In this way, it is helpful for the terminal to realize surround playback from the first frame image in the original video stream.
在一种可能的设计中,该方法还包括:生成并发送该至少一个目标视频流的索引。其中,目标视频流的索引包括:目标视频流对应的摄像机的标识和目标视频流的类别。 目标视频流对应的摄像机是目标视频流中的第一帧图像对应的摄像机。其中,如果该第一帧图像由一个摄像机采集得到,或者由一个摄像机采集的图像经处理后得到,则第一帧图像与该摄像机对应。视频流的类别用于表征视频流是原始视频流或目标视频流。这样,当摄像机的数量较多时,有助于节省视频流的索引所占的存储空间。In a possible design, the method further includes: generating and sending an index of the at least one target video stream. The index of the target video stream includes: the identifier of the camera corresponding to the target video stream and the category of the target video stream. The camera corresponding to the target video stream is the camera corresponding to the first frame image in the target video stream. Wherein, if the first frame of image is acquired by a camera, or an image acquired by a camera is obtained after processing, the first frame of image corresponds to the camera. The category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream. In this way, when the number of cameras is large, it helps to save the storage space occupied by the index of the video stream.
在一种可能的设计中,设定方向是该多个摄像机的环绕方向,该环绕方向包括顺时针方向或逆时针方向。In a possible design, the set direction is a surrounding direction of the plurality of cameras, and the surrounding direction includes a clockwise direction or a counterclockwise direction.
In a possible design, the method further includes: encapsulating the compressed at least one target video stream to obtain multiple segments, and then generating and sending indexes of the multiple segments. In this way, during video playback, the terminal can obtain the to-be-played video clip at segment granularity rather than at slice granularity or video-stream granularity, which helps to save transmission resources.
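For illustration, a compressed target video stream could be cut into independently transmittable segments roughly as follows; the granularity (one frame per segment by default) and the function name are assumptions of this sketch only.

```python
def split_into_segments(compressed_frames, frames_per_segment=1):
    """Encapsulate a compressed target video stream into segments, each holding
    one frame or several consecutive frames, so that the terminal can later
    request the to-be-played clip at segment granularity."""
    return [compressed_frames[i:i + frames_per_segment]
            for i in range(0, len(compressed_frames), frames_per_segment)]

# Segment indexes could then simply be the positions of the segments, e.g.
# segments = split_into_segments(frames); segment_indexes = range(len(segments)).
```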
In a second aspect, this application provides a video playback method. The method includes: receiving a to-be-played video clip, where the to-be-played video clip is selected from a first target video stream, and the first target video stream is a video stream obtained by selecting a certain number of frames from the original video stream corresponding to each of multiple cameras according to a set direction; the multiple original video streams corresponding to the multiple cameras are obtained based on video streams collected by the multiple cameras for a same spatial area in a same time period. Then, the to-be-played video clip is decompressed, and the decompressed video clip is played. Because the to-be-played video clip of the terminal is selected from the first target video stream, the terminal does not need to obtain images that were compressed based on different original video streams, which helps to reduce the transmission bit rate of the video stream.
在一种可能的设计中,第一目标视频流中的图像对应的时间戳连续。这样,有助于终端实现在连续的多个时间戳内实时环绕播放。In a possible design, the time stamps corresponding to the images in the first target video stream are consecutive. In this way, it is helpful for the terminal to realize real-time surround playback in multiple consecutive time stamps.
在一种可能的设计中,多个原始视频流包括第一原始视频流,第一目标视频流以第一原始视频流中的第一帧图像为起点。这样,有助于终端实现从原始视频流中的第一帧图像开始环绕播放。In a possible design, the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point. In this way, it is helpful for the terminal to realize surround playback from the first frame image in the original video stream.
在一种可能的设计中,在接收待播放视频片段之前,该方法还包括:发送请求消息,该请求消息用于请求待播放视频片段。In a possible design, before receiving the to-be-played video clip, the method further includes: sending a request message, where the request message is used to request the to-be-played video clip.
In a possible design, the request message includes an index of the first target video stream. The index of the first target video stream is used to determine the first target video stream, and includes an identifier of the camera corresponding to the first target video stream and a category of the first target video stream. The camera corresponding to the first target video stream is the camera corresponding to the first frame in the first target video stream; if that first frame is captured by a camera, or is obtained by processing an image captured by a camera, the first frame corresponds to that camera. The category of a video stream indicates whether the video stream is an original video stream or a target video stream. This possible design provides a technical solution in which the terminal sends the index of the first target video stream, that is, the terminal determines the index of the first target video stream, which helps to save computing resources of the network device.
在一种可能的设计中,该方法还包括:接收至少一个目标视频流的索引。该至少一个目标视频流是根据该多个原始视频流生成的,该至少一个目标视频流包括第一目标视频流。其中,该至少一个目标视频流的索引可以是终端主动向网络设备请求的,也可以是网络设备主动推送给终端的。In a possible design, the method further includes: receiving an index of at least one target video stream. The at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream. The index of the at least one target video stream may be actively requested by the terminal from the network device, or may be actively pushed by the network device to the terminal.
在一种可能的设计中,该请求消息还包括待播放视频片段所属的目标分段的索引。该情况下,接收待播放视频流片段,包括:接收目标分段。这样,有助于进一步降低视频流的传输码率。In a possible design, the request message further includes an index of the target segment to which the video segment to be played belongs. In this case, receiving the video stream segment to be played includes: receiving the target segment. In this way, it helps to further reduce the transmission bit rate of the video stream.
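A hypothetical request message combining the two designs above (a stream index plus a target-segment index) might look like the following; the field names and the dictionary layout are assumptions for illustration and are not a wire format defined by this application.

```python
# Hypothetical request a terminal could send when it wants to obtain a
# to-be-played clip from a particular target video stream.
request_message = {
    "stream_index": {
        "camera_id": "cam03",   # camera of the first frame of the first target video stream
        "category": "target",   # original video stream vs. target video stream
    },
    "segment_index": 42,        # index of the target segment containing the clip to be played
}
```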
在一种可能的设计中,该方法还包括:接收至少一个目标视频流中的分段的索引。其中,该至少一个目标视频流是根据该多个原始视频流生成的,该至少一个目标视频流包括第一目标视频流。其中,该至少一个目标视频流中的分段的索引可以是终端主动向网络设备请求的,也可以是网络设备主动推送给终端的。In one possible design, the method further includes: receiving an index of a segment in the at least one target video stream. Wherein, the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream. The index of the segment in the at least one target video stream may be actively requested by the terminal from the network device, or may be actively pushed by the network device to the terminal.
In a possible design, the method further includes: determining a surround direction of the to-be-played video clip and a timestamp corresponding to a start image of the to-be-played video clip; and then determining the index of the first target video stream based on the surround direction of the to-be-played video clip and the timestamp corresponding to the start image of the to-be-played video clip.
In a possible design, the surround direction corresponding to the first target video stream is the same as the surround direction of the to-be-played video clip, and the first target video stream contains the frame preceding a first image in the currently played video stream, where the timestamp corresponding to the first image is the same as the timestamp corresponding to the start image.
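Under the simplifying assumptions that target video streams take one frame per camera per timestamp step, start at timestamp 0, and that cameras are numbered 0..N-1 in the clockwise order, the selection described in the last two designs could be sketched as follows. The modular arithmetic and all names are assumptions of this example, not a procedure mandated by this application.

```python
def pick_first_target_stream(current_cam, start_ts, direction, num_cams):
    """Pick the target video stream whose surround direction matches the request
    and which contains, at timestamp start_ts - 1, the frame of the camera that
    is currently being played, so that switching streams at start_ts is seamless."""
    step = 1 if direction == "clockwise" else -1
    start_cam = (current_cam - step * (start_ts - 1)) % num_cams
    return {"camera_id": start_cam, "category": "target", "direction": direction}

# Example: 5 cameras, currently playing camera 0, surround playback should start
# at timestamp 4 in the clockwise direction.
print(pick_first_target_stream(0, 4, "clockwise", 5))  # {'camera_id': 2, ...}
```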
In a third aspect, this application provides a video playback method. The method includes: determining a to-be-played video clip, where the to-be-played video clip is selected from a first target video stream, and the first target video stream is a video stream obtained by selecting a certain number of frames from the original video stream corresponding to each of multiple cameras according to a set direction; the multiple original video streams corresponding to the multiple cameras are obtained based on video streams collected by the multiple cameras for a same spatial area in a same time period. Then, the to-be-played video clip is sent to the terminal.
在一种可能的设计中,第一目标视频流中的图像对应的时间戳连续。In a possible design, the time stamps corresponding to the images in the first target video stream are consecutive.
在一种可能的设计中,该多个原始视频流包括第一原始视频流,第一目标视频流以第一原始视频流中的第一帧图像为起点。In a possible design, the multiple original video streams include a first original video stream, and the first target video stream takes a first frame image in the first original video stream as a starting point.
在一种可能的设计中,在确定待播放视频流片段之前,该方法还包括:接收终端发送的请求消息,该请求消息用于请求待播放视频片段。In a possible design, before determining the to-be-played video stream segment, the method further includes: receiving a request message sent by the terminal, where the request message is used to request the to-be-played video segment.
在一种可能的设计中,该请求消息包括第一目标视频流的索引,第一目标视频流的索引包括第一目标视频流对应的摄像机的标识和第一目标视频流的类别。其中,第一目标视频流对应的摄像机是第一目标视频流中的第一帧图像对应的摄像机。视频流的类别用于表征视频流是原始视频流或目标视频流。该方法还包括:基于第一目标视频流的索引,确定第一目标视频流。In a possible design, the request message includes an index of the first target video stream, and the index of the first target video stream includes an identifier of a camera corresponding to the first target video stream and a category of the first target video stream. Wherein, the camera corresponding to the first target video stream is the camera corresponding to the first frame image in the first target video stream. The category of the video stream is used to characterize whether the video stream is the original video stream or the target video stream. The method further includes: determining the first target video stream based on the index of the first target video stream.
In a possible design, the method further includes: sending an index of at least one target video stream to the terminal, where the at least one target video stream is generated according to the multiple original video streams, and the at least one target video stream includes the first target video stream.
在一种可能的设计中,该请求消息还包括待播放视频片段所属的目标分段的索引。该方法还包括:基于目标分段的索引,从第一目标视频流中确定目标分段。该情况下,向终端发送待播放视频流片段,包括:向终端发送目标分段。In a possible design, the request message further includes an index of the target segment to which the video segment to be played belongs. The method also includes determining the target segment from the first target video stream based on the index of the target segment. In this case, sending the video stream segment to be played to the terminal includes: sending the target segment to the terminal.
在一种可能的设计中,该方法还包括:向终端发送至少一个目标视频流中的分段的索引。其中,该至少一个目标视频流是根据该多个原始视频流生成的,该至少一个目标视频流包括第一目标视频流。In a possible design, the method further includes: sending an index of a segment in the at least one target video stream to the terminal. Wherein, the at least one target video stream is generated according to the plurality of original video streams, and the at least one target video stream includes a first target video stream.
第三方面提供的任一种视频播放方法中的相关内容的解释及有益效果可以参考第二方面对应的视频播放方法中的对应描述,此处不再赘述。For explanations and beneficial effects of the relevant content in any video playback method provided in the third aspect, reference may be made to the corresponding description in the video playback method corresponding to the second aspect, which will not be repeated here.
需要说明的是,执行上述第一方面提供的方法的网络设备与执行上述第三方面提供的方法的网络设备可以相同,也可以不同。It should be noted that the network device that executes the method provided by the first aspect may be the same as or different from the network device that executes the method provided by the third aspect.
第四方面,本申请提供了一种视频编码装置。该装置可以是芯片或者网络设备。In a fourth aspect, the present application provides a video encoding apparatus. The apparatus may be a chip or a network device.
在一种可能的设计中,该装置用于执行上述第一方面提供的任一种方法。本申请可以根据上述第一方面提供的方法,对该装置进行功能模块的划分。例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。示例性的,本申请可以按照功能将该装置划分为获取单元、生成单元和压缩单元等。上述划分的各个功能模块执行的可能的技术方案和有益效果的描述均可以参考上述第一方面中相应的技术方案,此处不再赘述。In a possible design, the apparatus is used to perform any one of the methods provided in the first aspect above. The present application may divide the device into functional modules according to the method provided in the first aspect. For example, each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. Exemplarily, the present application may divide the apparatus into an acquisition unit, a generation unit, a compression unit, and the like according to functions. For descriptions of possible technical solutions and beneficial effects performed by each of the above-divided functional modules, reference may be made to the corresponding technical solutions in the first aspect, which will not be repeated here.
在另一种可能的设计中,该装置包括:处理器,用于实现上述第一方面描述的任一种方法。该装置还可以包括存储器,存储器与处理器耦合,存储器用于存储计算机程序,处理器执行存储器中存储的计算机程序时,可以实现上述第一方面描述的任一种方法。该装置还可以包括通信接口,该通信接口用于该设备与其它设备进行通信,示例性的,通信端口可以是收发器、电路、总线、模块或其它类型的通信接口。本申请中存储器中的计算机程序可以预先存储也可以使用该装置时从互联网下载后存储,本申请对于存储器中计算机程序的来源不进行唯一限定。本申请实施例中的耦合是单元或模块之间的间接耦合或连接,其可以是电性,机械或其它的形式,用于单元或模块之间的信息交互。In another possible design, the apparatus includes: a processor for implementing any one of the methods described in the first aspect above. The apparatus may further include a memory, the memory is coupled to the processor, and the memory is used for storing a computer program. When the processor executes the computer program stored in the memory, any one of the methods described in the first aspect can be implemented. The apparatus may also include a communication interface for the device to communicate with other devices, for example, the communication port may be a transceiver, circuit, bus, module or other type of communication interface. The computer program in the memory in this application can be pre-stored or stored after being downloaded from the Internet when the device is used. This application does not uniquely limit the source of the computer program in the memory. The coupling in this embodiment of the present application is an indirect coupling or connection between units or modules, which may be in electrical, mechanical or other forms, and is used for information interaction between units or modules.
第五方面,本申请提供了一种视频播放装置。In a fifth aspect, the present application provides a video playback device.
在一种可能的设计中,该装置用于执行上述第二方面或第三方面提供的任一种方法。本申请可以根据上述第二方面或第三方面提供的方法,对该装置进行功能模块的划分。例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。In a possible design, the apparatus is used to perform any one of the methods provided in the second aspect or the third aspect. The present application may divide the device into functional modules according to the method provided in the second aspect or the third aspect. For example, each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
示例性的,当该装置执行第二方面提供的任一种方法时,该装置可以是终端,本申请可以按照功能将该装置划分为接收单元、解压缩单元和播放单元等。上述划分的各个功能模块执行的可能的技术方案和有益效果的描述均可以参考上述第二方面中相应的技术方案,此处不再赘述。Exemplarily, when the apparatus performs any one of the methods provided in the second aspect, the apparatus may be a terminal, and the present application may divide the apparatus into a receiving unit, a decompression unit, and a playing unit according to functions. For descriptions of possible technical solutions and beneficial effects performed by each of the above-divided functional modules, reference may be made to the corresponding technical solutions in the second aspect, which will not be repeated here.
当该装置执行第三方面提供的任一种方法时,该装置可以是芯片或网络设备,本申请可以按照功能将该装置划分为确定单元和发送单元等。上述划分的各个功能模块执行的可能的技术方案和有益效果的描述均可以参考上述第三方面中相应的技术方案,此处不再赘述。When the apparatus executes any of the methods provided in the third aspect, the apparatus may be a chip or a network device, and the present application may divide the apparatus into a determining unit and a sending unit according to functions. For descriptions of possible technical solutions and beneficial effects performed by each of the above-divided functional modules, reference may be made to the corresponding technical solutions in the above-mentioned third aspect, which will not be repeated here.
在另一种可能的设计中,该装置包括:处理器,用于实现上述第二方面或第三方面描述的任一种方法。该装置还可以包括存储器,存储器与处理器耦合,存储器用于存储计算机程序,处理器执行存储器中存储的计算机程序时,可以实现上述第二方面或第三方面描述的任一种方法。该装置还可以包括通信接口,该通信接口用于该设备与其它设备进行通信,示例性的,通信端口可以是收发器、电路、总线、模块或其它类型的通信接口。本申请中存储器中的计算机程序可以预先存储也可以使用该装置时从互联网下载后存储,本申请对于存储器中计算机程序的来源不进行唯一限定。本申请实施例中的耦合是单元或模块之间的间接耦合或连接,其可以是电性,机械或其它的形式,用于单元或模块之间的信息交互。In another possible design, the apparatus includes: a processor, configured to implement any one of the methods described in the second aspect or the third aspect. The apparatus may further include a memory coupled to the processor, where the memory is used to store a computer program, and when the processor executes the computer program stored in the memory, any one of the methods described in the second or third aspects above can be implemented. The apparatus may also include a communication interface for the device to communicate with other devices, for example, the communication port may be a transceiver, circuit, bus, module or other type of communication interface. The computer program in the memory in this application can be pre-stored or stored after being downloaded from the Internet when the device is used. This application does not uniquely limit the source of the computer program in the memory. The coupling in this embodiment of the present application is an indirect coupling or connection between units or modules, which may be in electrical, mechanical or other forms, and is used for information interaction between units or modules.
第六方面,本申请提供了一种计算机可读存储介质,如计算机非瞬态的可读存储介质。其上储存有计算机程序(或指令),当该计算机程序(或指令)在视频编码装 置上运行时,使得该视频编码装置执行上述第一方面提供的任一种方法。In a sixth aspect, the present application provides a computer readable storage medium, such as a computer non-transitory readable storage medium. A computer program (or instruction) is stored thereon, and when the computer program (or instruction) runs on the video encoding apparatus, the video encoding apparatus is made to execute any one of the methods provided in the first aspect.
第七方面,本申请提供了一种计算机可读存储介质,如计算机非瞬态的可读存储介质。其上储存有计算机程序(或指令),当该计算机程序(或指令)在视频播放装置上运行时,使得该视频播放装置执行上述第二方面或第三方面提供的任一种方法。In a seventh aspect, the present application provides a computer readable storage medium, such as a computer non-transitory readable storage medium. A computer program (or instruction) is stored thereon, and when the computer program (or instruction) runs on the video playback device, the video playback device is made to execute any one of the methods provided in the second aspect or the third aspect.
第八方面,本申请提供了一种计算机程序产品,当其在计算机上运行时,使得第一方面至第三方面提供的任一种方法被执行。In an eighth aspect, the present application provides a computer program product that, when executed on a computer, enables any one of the methods provided in the first to third aspects to be executed.
第九方面,本申请提供了一种芯片系统,包括:处理器,处理器用于从存储器中调用并运行该存储器中存储的计算机程序,执行第一方面至第三方面提供的任一种方法。In a ninth aspect, the present application provides a chip system, comprising: a processor, where the processor is configured to call and run a computer program stored in the memory from a memory, and execute any one of the methods provided in the first to third aspects.
第十方面,本申请提供了一种视频系统,包括:网络设备和终端。网络设备可以用于执行上述第一方面提供的任一种方法,终端用于执行上述第二方面提供的任一种方法。In a tenth aspect, the present application provides a video system, including: a network device and a terminal. The network device may be configured to execute any of the methods provided in the foregoing first aspect, and the terminal may be configured to execute any of the foregoing methods provided in the second aspect.
在一种可能的设计中,该网络设备还可以用于执行上述第三方面提供的任一种方法。或者,该视频系统还包括其他网络设备,用于执行上述第三方面提供的任一种方法。In a possible design, the network device may also be used to execute any one of the methods provided in the third aspect. Alternatively, the video system further includes other network devices for executing any of the methods provided in the third aspect.
可以理解的是,上述提供的任一种视频编码装置、视频播放装置、计算机存储介质、计算机程序产品或视频系统等均可以应用于上文所提供的对应的方法,因此,其所能达到的有益效果可参考对应的方法中的有益效果,此处不再赘述。It can be understood that any of the above-provided video encoding devices, video playback devices, computer storage media, computer program products or video systems can be applied to the corresponding methods provided above. For the beneficial effects, reference may be made to the beneficial effects in the corresponding method, which will not be repeated here.
在本申请中,上述视频编码装置和视频播放装置的名字对设备或功能模块本身不构成限定,在实际实现中,这些设备或功能模块可以以其他名称出现。只要各个设备或功能模块的功能和本申请类似,属于本申请权利要求及其等同技术的范围之内。In this application, the names of the above-mentioned video encoding apparatus and video playback apparatus do not limit the devices or functional modules themselves. In actual implementation, these devices or functional modules may appear in other names. As long as the functions of each device or functional module are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalents.
本申请的这些方面或其他方面在以下的描述中会更加简明易懂。These and other aspects of the present application will be more clearly understood from the following description.
Description of Drawings
FIG. 1 is a schematic diagram of a distribution manner of cameras according to an embodiment of this application;
FIG. 2 is a schematic diagram of the relationship among a video stream, slices, segments, I frames and P frames according to an embodiment of this application;
FIG. 3 is a schematic diagram of video streams collected by multiple cameras in a surround playback scenario in the conventional technology;
FIG. 4 is a schematic diagram of a transmitted video stream in the conventional technology, based on FIG. 3;
FIG. 5A is a schematic structural diagram of a video system according to an embodiment of this application;
FIG. 5B is a schematic structural diagram of another video system according to an embodiment of this application;
FIG. 5C is a schematic structural diagram of another video system according to an embodiment of this application;
FIG. 5D is a schematic structural diagram of another video system according to an embodiment of this application;
FIG. 6 is a schematic diagram of a dual-focus application scenario according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of this application;
FIG. 8 is a schematic flowchart of a video encoding method according to an embodiment of this application;
FIG. 9 is a schematic process diagram of a video encoding method according to an embodiment of this application;
FIG. 10 is a schematic diagram of an encapsulation format according to an embodiment of this application;
FIG. 11 is a schematic flowchart of a video playback method according to an embodiment of this application;
FIG. 12 is a schematic diagram of a video stream transmitted between a distribution-side system and a terminal and a played video stream according to an embodiment of this application;
FIG. 13 is a schematic diagram of another video stream transmitted between a distribution-side system and a terminal and a played video stream according to an embodiment of this application;
FIG. 14 is a schematic structural diagram of a video encoding apparatus according to an embodiment of this application;
FIG. 15 is a schematic structural diagram of a video playback apparatus according to an embodiment of this application;
FIG. 16 is a schematic structural diagram of another video playback apparatus according to an embodiment of this application.
Detailed Description of Embodiments
首先,说明本申请涉及的部分术语和技术,以方便读者理解:First, some terms and technologies involved in this application are explained for the convenience of readers:
1)、环绕播放1), surround playback
环绕播放是指以某一环绕方向播放在不同角度针对同一空间区域采集的图像。环绕播放包括静止环绕播放和动态环绕播放。静止环绕播放是指以某一环绕方向播放同一时刻在不同角度针对同一空间区域采集的图像。动态环绕播放是指以某一环绕方向播放不同时刻在不同角度针对同一空间区域采集的图像。本申请所涉及的环绕播放是指动态环绕播放。Surround playback refers to playback of images collected at different angles for the same spatial area in a certain surround direction. Surround playback includes still surround playback and dynamic surround playback. Still surround playback refers to playback of images collected from the same spatial area at different angles at the same moment in a certain surround direction. Dynamic surround playback refers to the playback of images collected from the same spatial area at different times and at different angles in a certain surround direction. The surround playback involved in this application refers to dynamic surround playback.
环绕方向,是指以采集当前播放的图像所在的角度为基准的顺时针方向或逆时针方向。The surround direction refers to the clockwise or counterclockwise direction based on the angle at which the currently playing image is captured.
空间区域,是指环绕播放所针对的区域,通常是具有特定焦点的区域,如篮球赛场所在的空间区域,或者演唱会的舞台所在的空间区域等。The space area refers to the area targeted for surround playback, usually an area with a specific focus, such as the space area where the basketball arena is located, or the space area where the stage of the concert is located.
On the one hand, surround playback requires video streams collected for the same spatial area, at different angles, by multiple cameras distributed in a specific manner, where one video stream includes consecutive frames. The field of view of each of the multiple cameras overlaps the spatial area; the fields of view of different cameras may or may not overlap each other. This embodiment of this application does not limit the distribution manner of the multiple cameras. For example, the multiple cameras may be distributed in a ring, as shown in part (a) of FIG. 1; in a fan shape, as shown in part (b) of FIG. 1; at a right angle (90°), as shown in part (c) of FIG. 1; or at a straight angle (180°, that is, along a straight line), as shown in part (d) of FIG. 1.
需要说明的是,在任一分布方式下,该多个摄像机可以均匀分布,也可以不均匀分布。图1中均是以该多个摄像机均匀分布为例进行说明的。It should be noted that, in any distribution manner, the plurality of cameras may be distributed evenly or unevenly. In FIG. 1 , the multiple cameras are uniformly distributed as an example for description.
As an example, the surround direction is, based on the distribution manner of the multiple cameras, the clockwise or counterclockwise direction taking as reference the camera that captured the currently played image (or the camera that captured the image used to obtain the currently played image). Surround playback plays, in that surround direction, the images captured by the multiple cameras (or images obtained by processing the images captured by the multiple cameras) in sequence.
另一方面,环绕播放要求基于相机同步技术,保证该多个摄像机在同一时间采集视频流。例如,基于相机同步技术,实现该多个摄像机均在某一时刻采集图像,且后续采集相邻两帧图像的时间间隔相同。关于“基于相机同步技术,以保证该多个摄像机在同一时间采集视频流”的具体实现方式可以参考现有技术。On the other hand, surround playback requires camera synchronization technology to ensure that the multiple cameras capture video streams at the same time. For example, based on the camera synchronization technology, it is realized that the multiple cameras all collect images at a certain moment, and the time interval for subsequent acquisition of two adjacent frames of images is the same. Regarding the specific implementation of "based on the camera synchronization technology to ensure that the multiple cameras capture video streams at the same time", reference may be made to the prior art.
2)、视频流、I帧、P帧、分片2), video stream, I frame, P frame, slice
视频流中的每帧图像对应一个时间戳,该时间戳用于该图像与其他视频流中的图像对齐,以及用于图像传输等。例如,假设环绕播放场景的采集侧系统部署有5个摄像机,每个摄像机采集的视频流均包括100帧图像,则按照采集的先后顺序,每个视频流中的图像可以依次对应时间戳1-100。Each frame of image in a video stream corresponds to a timestamp, which is used for aligning the image with images in other video streams, and for image transmission, etc. For example, assuming that there are 5 cameras deployed in the acquisition side system of the surrounding playback scene, and the video stream collected by each camera includes 100 frames of images, then according to the sequence of acquisition, the images in each video stream can correspond to timestamp 1- 100.
视频流中的图像可以包括I帧和P帧,I帧、P帧表示该帧图像的压缩方式。具体的:The images in the video stream may include I frames and P frames, and the I frames and P frames indicate the compression mode of the frame images. specific:
I帧表示关键帧。编码时,I帧属于帧内压缩编码,即基于本帧图像进行压缩,解码时,基于压缩后得到的数据即可解压缩,得到本帧图像。I-frames represent keyframes. During encoding, the I frame belongs to intra-frame compression encoding, that is, compression is performed based on the image of the current frame, and during decoding, the image of the current frame can be obtained by decompression based on the compressed data.
A P frame represents the difference between the current frame and its preceding frame (which may be an I frame or a P frame). During encoding, a P frame uses inter-frame compression, that is, compression is performed based on the difference between the current frame and its preceding frame; during decoding, the current frame is obtained by decompression based on the compressed data and the preceding frame.
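The dependency between I frames and P frames described above can be illustrated with a toy decode loop; decode_intra and apply_residual are placeholders standing in for a real codec and are not APIs used by this application.

```python
def decode_intra(data):
    # Placeholder: a real decoder would reconstruct the full picture from the
    # compressed data of the I frame alone.
    return data

def apply_residual(reference, data):
    # Placeholder: a real decoder would add the decoded residual to the
    # reference (previous) frame to reconstruct the P frame.
    return reference + data

def decode(frames):
    """Toy decode loop: an I frame is decoded from its own data only, while a
    P frame additionally needs the previously decoded frame as its reference."""
    decoded, prev = [], None
    for frame in frames:
        if frame["type"] == "I":
            prev = decode_intra(frame["data"])
        else:  # "P"
            prev = apply_residual(prev, frame["data"])
        decoded.append(prev)
    return decoded
```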
一个视频流可以包括多个分片,每个分片包括一帧图像或连续的多帧图像。每个分片可以独立进行编解码,并独立进行传输。A video stream may include multiple slices, and each slice includes one frame of image or consecutive multiple frames of images. Each fragment can be encoded and decoded independently and transmitted independently.
示例的,一个分片中的第一帧图像是I帧,其他图像可以是I帧,也可以是P帧。通常,一个分片中的第一帧图像是I帧,其他图像是P帧。For example, the first frame of image in a slice is an I frame, and other images may be I frames or P frames. Typically, the first picture in a slice is an I-frame, and the other pictures are P-frames.
3)、分段3), segment
Images are encapsulated according to a certain encapsulation format to obtain segments. A segment may include one frame or multiple consecutive frames, and each frame may be an I frame or a P frame. One slice may include multiple segments. Segments cannot be encoded or decoded independently, but they can be transmitted independently.
本申请实施例对封装方式不进行限定,例如封装方式可以包括chunk封装或输送流(transport stream,TS)封装等。This embodiment of the present application does not limit the encapsulation manner, for example, the encapsulation manner may include chunk encapsulation or transport stream (transport stream, TS) encapsulation, and the like.
如图2所示,为本申请实施例提供的一种视频流、分片、分段、I帧和P帧之间的关系的示意图。图2是以视频流包括多个分片,如分片1和分片2等,一个分片包括10帧图像,且每个分片中的第一帧图像是I帧,其他图像是P帧;一个分段包括一帧图像为例进行说明的。As shown in FIG. 2 , it is a schematic diagram of the relationship among a video stream, a slice, a segment, an I frame and a P frame according to an embodiment of the present application. Figure 2 shows that the video stream includes multiple slices, such as slice 1 and slice 2, etc. One slice includes 10 frames of images, and the first frame image in each slice is an I frame, and the other images are P frames ; A segment includes a frame of image as an example to illustrate.
4)、其他术语4), other terms
在本申请实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。In the embodiments of the present application, words such as "exemplary" or "for example" are used to represent examples, illustrations or illustrations. Any embodiments or designs described in the embodiments of the present application as "exemplary" or "such as" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present the related concepts in a specific manner.
在本申请实施例中,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本申请实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。In the embodiments of the present application, the terms "first" and "second" are only used for description purposes, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature defined as "first" or "second" may expressly or implicitly include one or more of that feature. In the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
In a surround playback scenario, the video clip played by the terminal is recombined from images collected by multiple cameras. In the conventional technology, in the encoding phase, the video streams collected by different cameras are compressed independently of each other. Therefore, so that the terminal can correctly decode the recombined video clip in the playback phase, each video stream contains, in the encoding phase, multiple groups of pictures (GOPs), where the first image of each GOP is an I frame and the other images may be P frames. The more I frames there are, the higher the transmission bit rate.
For example, as shown in FIG. 3, assume that cameras 1-5 deployed in the clockwise direction respectively collect video streams 1-5, and each video stream contains 100 frames. In the encoding phase, the encoder compresses the five video streams separately. In the playback phase, assume that the terminal starts playing from the 1st frame of video stream 1 (that is, normal playback), and that while playing the 4th frame of video stream 1 the terminal determines, as instructed by the user, that clockwise surround playback needs to start. During surround playback, the video clip played by the terminal then sequentially includes: the 5th frame of video stream 2, the 6th frame of video stream 3, the 7th frame of video stream 4, the 8th frame of video stream 5, the 9th frame of video stream 1, the 10th frame of video stream 2, and so on, as shown in FIG. 4.
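The frame selection in this example can be reproduced with a few lines of Python; the helper below is purely illustrative and only restates the clockwise walk over the five streams.

```python
def surround_sequence(num_streams, start_stream, start_frame, count):
    """Return (stream number, frame number) pairs for clockwise surround playback
    that begins after frame start_frame of stream start_stream (1-based)."""
    seq = []
    for i in range(1, count + 1):
        stream = (start_stream - 1 + i) % num_streams + 1
        seq.append((stream, start_frame + i))
    return seq

print(surround_sequence(5, 1, 4, 6))
# [(2, 5), (3, 6), (4, 7), (5, 8), (1, 9), (2, 10)]
```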
Referring to FIG. 4, during surround playback, the 5th frame and the 4th frame of the video clip played by the terminal are selected from different video streams. Because in the encoding phase the encoder compresses the video stream collected by each camera separately, as shown in FIG. 3, the terminal cannot refer to the 4th frame of the played video stream when it needs to correctly decode the 5th frame during surround playback; therefore, in the encoding phase, the 5th frame needs to be set as an I frame. Similarly, the same problem exists for the subsequent images in the surround playback phase. This causes the problem of a high transmission bit rate.
Because the first frame of each GOP is an I frame and the other frames are P frames, if a GOP contains many images and the first frame of the surround playback is not the first frame of a GOP, the terminal has to decode sequentially from the first frame of that GOP before it can obtain the first frame of the surround playback, which prevents real-time surround playback. Therefore, to implement real-time surround playback, the GOP in the conventional technology is generally set very small, for example one GOP includes one or two frames, which results in a very large number of I frames in a video stream and thus in a high video transmission bit rate. For example, if a GOP includes 5 frames and the first frame of the surround playback is the 5th frame of that GOP, the terminal needs to decode the 2nd frame based on the 1st frame of the GOP, decode the 3rd frame based on the 2nd frame, and so on, until the 5th frame is decoded based on the 4th frame. This process takes a long time, so real-time surround playback cannot be implemented.
To this end, embodiments of this application provide a video encoding method and a video playback method that can be applied to a surround playback scenario. In the encoding phase, the network device selects, according to a set direction, a certain number of frames from the original video stream corresponding to each camera in the surround playback scenario to generate a target video stream, and then compresses the target video stream. Based on this, in the surround playback phase, the network device transmits to the terminal a video clip selected from the compressed target video stream, so that the terminal can decode it to obtain the to-be-played video clip, without the need to further transmit to the terminal images that were compressed based on different original video streams. This helps to reduce the transmission bit rate of the video stream.
Further, because the target video stream is composed of images from multiple original video streams, compressing the target video stream is equivalent to compressing an image of one original video stream with reference to an image of another original video stream. Correspondingly, the terminal can decode an image of one original video stream based on an image of another original video stream. Compared with the conventional technology, the video stream transmitted during video playback does not need to include many I frames, which helps to reduce the transmission bit rate of the video stream.
如图5A所示,为本申请实施例提供的一种视频系统1的架构示意图。该视频系统1包括:采集侧系统10、编码器20,以及一个或多个终端(如终端30和终端31)。As shown in FIG. 5A , it is a schematic structural diagram of a video system 1 according to an embodiment of the present application. The video system 1 includes: an acquisition side system 10, an encoder 20, and one or more terminals (eg, a terminal 30 and a terminal 31).
采集侧系统10,包括多个摄像机。该多个摄像机基于特定方式(如图1所示的方式)分布部署于某一空间区域,以在不同角度采集针对该空间区域的视频流。The acquisition side system 10 includes a plurality of cameras. The multiple cameras are distributed and deployed in a certain spatial area based on a specific manner (as shown in FIG. 1 ), so as to collect video streams for the spatial area from different angles.
每个摄像机可以是定焦摄像机。该多个摄像机的焦点可以对应于同一焦点区域,也可以对应于多个焦点区域。也就是说,本申请实施例的环绕播放可以应用于单焦点场景中,也可以应用于多焦点场景(如双焦点场景)中。Each camera can be a fixed focus camera. The focal points of the multiple cameras may correspond to the same focus area, or may correspond to multiple focus areas. That is to say, the surround playback of the embodiments of the present application can be applied to a single-focus scene, and can also be applied to a multi-focus scene (eg, a bi-focus scene).
FIG. 6 is a schematic diagram of a dual-focus application scenario. FIG. 6 shows a scene set up for a basketball game: multiple cameras are deployed around the basketball court and are distributed in a ring, and each camera has one focal point, where the focal points of some of the cameras are in a first focus area and the focal points of the other cameras are in a second focus area, thereby forming a dual-focus scene. In one example, manual focusing is used to position the focal points of some of the cameras on the first focus area of the basketball court and the focal points of the other cameras on the second focus area, and camera synchronization ensures that the multiple cameras capture images at the same moment and subsequently capture adjacent frames at the same time interval.
编码器20,用于执行本申请实施例提供的视频编码方法。其中,本申请实施例提供的视频编码方法的具体实现方式可以参考下文,例如参考图8所示的视频编码方法。The encoder 20 is configured to execute the video encoding method provided by the embodiment of the present application. The specific implementation of the video encoding method provided by the embodiments of the present application may refer to the following, for example, refer to the video encoding method shown in FIG. 8 .
终端,也可以被称为播放终端,用于解码并播放视频流中的视频片段。本申请实施例对终端的物理形态不进行限定,例如,可以是智能手机、平板电脑等。The terminal, which may also be referred to as a playback terminal, is used to decode and play video clips in the video stream. This embodiment of the present application does not limit the physical form of the terminal, for example, it may be a smart phone, a tablet computer, or the like.
编码器20是一个功能模块,其功能可以通过软件实现,或者通过硬件实现,或者通过软件结合硬件实现。The encoder 20 is a functional module, and its functions can be implemented by software, or by hardware, or by software combined with hardware.
图5A中是以采集侧系统10与编码器20独立设置为例进行说明的。在一些实施例中,采集侧系统10还可以包括与上述多个摄像机连接的控制节点,该控制节点可以对该多个摄像机进行控制。如图5B所示,编码器20可以集成在该控制节点中。当然,编码器20也可以独立于采集侧系统10,此时,该多个摄像机可以将采集到的视频流发送给控制节点,由控制节点将该视频流(或者将处理后的视频流)发送给编码器20。In FIG. 5A , it is illustrated by taking an example that the acquisition-side system 10 and the encoder 20 are independently installed. In some embodiments, the acquisition-side system 10 may further include a control node connected to the above-mentioned multiple cameras, and the control node may control the multiple cameras. As shown in Figure 5B, the encoder 20 may be integrated in the control node. Of course, the encoder 20 can also be independent of the acquisition-side system 10. At this time, the multiple cameras can send the captured video streams to the control node, and the control node sends the video streams (or the processed video streams) to the encoder 20.
在一些实施例中,编码器20还用于与终端通信,从而为终端提供编码后的视频流中的视频片段,以实现视频播放。In some embodiments, the encoder 20 is further configured to communicate with the terminal, so as to provide the terminal with video segments in the encoded video stream, so as to realize video playback.
在另一些实施例中,如图5C所示,视频系统1还可以包括:分发侧系统40。编码器20可以集成在分发侧系统40中,也可以与分发侧系统40独立设置。图5C中是以编码器20与分发侧系统40独立设置为例进行说明的。In other embodiments, as shown in FIG. 5C , the video system 1 may further include: a distribution side system 40 . The encoder 20 may be integrated in the distribution-side system 40 , or may be provided independently from the distribution-side system 40 . In FIG. 5C , the encoder 20 and the distribution-side system 40 are independently installed as an example for description.
基于图5C,编码器20还可以用于将编码结果发送给分发侧系统40。分发侧系统40用于与终端通信,从而为终端分发编码后的视频流中的视频片段,以实现视频播放。Based on FIG. 5C , the encoder 20 can also be used to send the encoding result to the distribution-side system 40 . The distribution-side system 40 is configured to communicate with the terminal, so as to distribute the video clips in the encoded video stream to the terminal, so as to realize video playback.
可选的,分发侧系统40可以是内容分发网络(content delivery network,CDN)。该CDN可以是传统技术中的任意一种CDN,当然不限于此。CDN的功能可以由一个服务器实现,也可以由多个服务器共同实现。可选的,分发侧系统40可以是一个或多个专用服务器,即为了实现本申请实施例提供的视频播放方法而专门设置的服务器或服务器集群。通常,相比使用专用服务器,使用CDN进行视频分发,能够节省成本。Optionally, the distribution-side system 40 may be a content delivery network (content delivery network, CDN). The CDN may be any kind of CDN in the conventional technology, but is of course not limited to this. The function of CDN can be implemented by one server or by multiple servers. Optionally, the distribution-side system 40 may be one or more dedicated servers, that is, servers or server clusters specially set up to implement the video playback method provided by the embodiments of the present application. In general, using a CDN for video distribution can save costs compared to using a dedicated server.
如果分发侧系统40是CDN,则该视频系统中还可以包括源站50。其中,源站50是CDN的数据来源。如图5D所示,编码器20可以集成在源站50中。当然,编码器20也可以独立于源站50,此时,编码器20可以将执行本申请实施例提供的视频编码方法的编码结果经源站50发送给CDN。源站50可以通过一个服务器实现,也可以通过多个服务器共同实现。If the distribution-side system 40 is a CDN, the video system may also include an origin station 50 . The source site 50 is the data source of the CDN. As shown in FIG. 5D , the encoder 20 may be integrated in the source station 50 . Of course, the encoder 20 may also be independent of the source station 50. In this case, the encoder 20 may send the encoding result of the video encoding method provided by the embodiment of the present application to the CDN via the source station 50. The source station 50 may be implemented by one server, or may be implemented by multiple servers together.
It should be noted that the video systems shown in FIG. 5A to FIG. 5D are all examples of video systems applicable to the embodiments of this application, and they do not constitute a limitation on the video systems to which the video encoding and video playback methods provided in the embodiments of this application are applicable.
在硬件实现上,上述用于实现编码器20的功能的设备,用于实现分发侧系统40的功能的服务器,以及终端等均可以通过如图7所示的计算机设备70实现。In terms of hardware implementation, the above-mentioned device for realizing the function of the encoder 20, the server for realizing the function of the distribution side system 40, and the terminal can all be realized by the computer device 70 as shown in FIG. 7 .
如图7所示,计算机设备70可以用于实现本申请实施例提供的视频编码方法或者视频播放方法。例如,当计算机设备70是实现编码器20的功能的设备时,用于实现本申请实施例提供的视频编码方法,可选的,还用于实现本申请实施例提供的视频播放方法。又例如,当计算机设备70是实现分发侧系统40的功能的服务器或者终端时,用于实现本申请实施例提供的视频播放方法。As shown in FIG. 7 , a computer device 70 may be used to implement the video encoding method or the video playing method provided by the embodiments of the present application. For example, when the computer device 70 is a device that implements the functions of the encoder 20, it is used to implement the video encoding method provided by the embodiment of the present application, and optionally, it is also used to implement the video playback method provided by the embodiment of the present application. For another example, when the computer device 70 is a server or a terminal that implements the functions of the distribution-side system 40, it is used to implement the video playback method provided by the embodiment of the present application.
图7所示的计算机设备70可以包括:处理器701、存储器702、通信接口703以及总 线704。处理器701、存储器702以及通信接口703之间可以通过总线704连接。The computer device 70 shown in FIG. 7 may include a processor 701, a memory 702, a communication interface 703, and a bus 704. The processor 701 , the memory 702 and the communication interface 703 may be connected through a bus 704 .
处理器701是计算机设备70的控制中心,具体可以是一个通用中央处理单元(central processing unit,CPU),也可以是其他通用处理器等。其中,通用处理器可以是微处理器或者是任何常规的处理器等。The processor 701 is the control center of the computer device 70, and may specifically be a general-purpose central processing unit (central processing unit, CPU), or other general-purpose processors, or the like. Wherein, the general-purpose processor may be a microprocessor or any conventional processor or the like.
作为一个示例,处理器701可以包括一个或多个CPU,例如图7中所示的CPU 0和CPU 1。As an example, processor 701 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 7 .
The memory 702 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
在一种可能的实现方式中,存储器702可以独立于处理器701。存储器702可以通过总线704与处理器701相连接,用于存储数据、指令或者程序代码。处理器701调用并执行存储器702中存储的指令或程序代码时,能够实现本申请实施例提供的视频编码方法或视频播放方法。In one possible implementation, the memory 702 may be independent of the processor 701 . The memory 702 may be connected to the processor 701 through a bus 704 for storing data, instructions or program codes. When the processor 701 calls and executes the instructions or program codes stored in the memory 702, the video encoding method or the video playing method provided by the embodiments of the present application can be implemented.
在另一种可能的实现方式中,存储器702也可以和处理器701集成在一起。In another possible implementation manner, the memory 702 can also be integrated with the processor 701 .
通信接口703,用于计算机设备70与其他设备通过通信网络连接,该通信网络可以是以太网,无线接入网(radio access network,RAN),无线局域网(wireless local area networks,WLAN)等。通信接口703可以包括用于接收数据的接收单元,以及用于发送数据的发送单元。The communication interface 703 is used for connecting the computer device 70 with other devices through a communication network, and the communication network can be an Ethernet, a radio access network (RAN), a wireless local area network (wireless local area networks, WLAN) and the like. The communication interface 703 may include a receiving unit for receiving data, and a transmitting unit for transmitting data.
总线704,可以是工业标准体系结构(industry standard architecture,ISA)总线、外部设备互连(peripheral component interconnect,PCI)总线或扩展工业标准体系结构(extended industry standard architecture,EISA)总线等。该总线可以分为地址总线、数据总线、控制总线等。为便于表示,图7中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。The bus 704 may be an industry standard architecture (industry standard architecture, ISA) bus, a peripheral component interconnect (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus or the like. The bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one thick line is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
It should be noted that the structure shown in FIG. 7 does not constitute a limitation on the computer device 70. In addition to the components shown in FIG. 7, the computer device 70 may include more or fewer components than shown, or combine some components, or have a different arrangement of components. For example, when the computer device 70 is a terminal, the computer device 70 may further include a display screen, an audio input/output apparatus, and the like, which is not limited in this embodiment of this application.
以下,结合附图,说明本申请实施例提供的视频编码方法。该方法可以应用于上文提供的任一视频系统(如图5A-图5D任一附图所示的视频系统)中。Hereinafter, the video encoding method provided by the embodiments of the present application will be described with reference to the accompanying drawings. The method can be applied to any of the video systems provided above (such as the video system shown in any of Figures 5A-5D).
FIG. 8 is a schematic flowchart of a video encoding method according to an embodiment of this application. This embodiment is described by using an example in which the method is applied to the video system shown in FIG. 5C. For example, the multiple cameras in this embodiment may be some or all of the cameras in the acquisition-side system 10 in FIG. 5C, and the encoder and the distribution-side system in this embodiment may respectively be the encoder 20 and the distribution-side system 40 in FIG. 5C.
图8所示的方法可以包括以下S101-S108:The method shown in FIG. 8 may include the following S101-S108:
S101:多个摄像机同一时间段针对同一空间区域采集多个视频流。例如,一个摄像机在该时间段针对该空间区域采集一个视频流。S101: Multiple cameras collect multiple video streams for the same spatial area in the same time period. For example, a camera captures a video stream for the spatial region during the time period.
关于该多个摄像机的分布方式,空间区域等的解释均可以参考上文。Regarding the distribution manner of the plurality of cameras, the explanation of the spatial area and the like may refer to the above.
可以理解的是,摄像机在启动之后,可以源源不断地采集图像,S101中的同一时间段可以是摄像机采集图像的过程中的任意一个时间段。例如,摄像机可以将每个周期采集的视频流作为一个原始视频流。其中,该周期的时长与S101中的时间段的时长相等。It can be understood that, after the camera is started, it can continuously collect images, and the same time period in S101 can be any time period in the process of the camera collecting images. For example, the camera can take the video stream captured at each cycle as a raw video stream. Wherein, the duration of the cycle is equal to the duration of the time period in S101.
S102:该多个摄像机向编码器发送多个原始视频流。例如,一个摄像机对应一个原始视频流。一个摄像机对应的原始视频流可以是该摄像机在S101中所采集的视频流,或者是该摄像机在对S101中所采集的视频流进行处理后得到的视频流。S102: The multiple cameras send multiple original video streams to the encoder. For example, one camera corresponds to one raw video stream. The original video stream corresponding to one camera may be the video stream collected by the camera in S101, or the video stream obtained by the camera after processing the video stream collected in S101.
也就是说,本申请实施例中,将编码器接收到的视频流定义为原始视频流。That is to say, in this embodiment of the present application, the video stream received by the encoder is defined as the original video stream.
S103:编码器根据该多个原始视频流,生成至少一个目标视频流。其中,目标视频流是按照设定方向从每个摄像机对应的原始视频流中选择一定数量帧图像得到的视频流。S103: The encoder generates at least one target video stream according to the multiple original video streams. The target video stream is a video stream obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to the set direction.
Optionally, the set direction is a surround direction of the multiple cameras, such as a clockwise surround direction or a counterclockwise surround direction.
Optionally, in the process of generating one target video stream, the numbers of frames selected from different original video streams may be the same or different.
Optionally, the timestamps corresponding to the images in the target video stream are consecutive. This helps the terminal implement real-time surround playback over multiple consecutive timestamps. The specific examples below are all described by taking consecutive timestamps in the target video stream as an example; this is stated here once and not repeated below.
Optionally, the multiple original video streams include a first original video stream, which may be any one of the multiple original video streams. The target video stream starts with the first frame of the first original video stream. This helps the terminal start surround playback from the first frame of an original video stream.
Optionally, each original video stream and each target video stream include the same number of frames. Combined with the foregoing optional implementation, this helps the terminal implement real-time surround playback starting from any frame of an original video stream.
Optionally, the numbers of target video streams generated by the encoder based on different original video streams are the same.
Optionally, S103 may include: for each of the multiple original video streams, the encoder generates one first-type video stream and one second-type video stream. Each original video stream, each first-type video stream, and each second-type video stream include the same number of frames. The first-type video stream corresponding to the first original video stream is a video stream obtained by taking the first frame of the first original video stream as a starting point and selecting, in a first surround direction of the multiple cameras, a first number of frames from the original video stream collected by each camera; the timestamps corresponding to the images in the first-type video stream are consecutive. The second-type video stream corresponding to the first original video stream is a video stream obtained by taking the first frame of the first original video stream as a starting point and selecting, in a second surround direction of the multiple cameras, a second number of frames from the original video stream collected by each camera; the timestamps corresponding to the images in the second-type video stream are consecutive.
The first surround direction is opposite to the second surround direction. Optionally, the first surround direction is clockwise and the second surround direction is counterclockwise.
This optional implementation is described by taking "the encoder generates the first-type video stream and the second-type video stream corresponding to each original video stream" as an example. Optionally, the encoder may also generate only one of the two. If the encoder generates the first-type video streams, the terminal can perform surround playback based on the first surround direction; if the encoder generates the second-type video streams, the terminal can perform surround playback based on the second surround direction.
The first numbers corresponding to different original video streams may or may not be equal. For ease of description, the following assumes that the first numbers corresponding to all original video streams are equal. The first number corresponding to an original video stream refers to the number of images selected from that original video stream each time in the process of generating a first-type video stream.
The second numbers corresponding to different original video streams may or may not be equal. For ease of description, the following assumes that the second numbers corresponding to all original video streams are equal. The second number corresponding to an original video stream refers to the number of images selected from that original video stream each time in the process of generating a second-type video stream.
In addition, each first number and each second number may or may not be equal. For ease of description, the following assumes that each first number is equal to each second number.
Both the first number and the second number may be values entered by an administrator, may be updated, and may be any integer greater than or equal to 1.
As shown in FIG. 9, assume that the acquisition-side system includes cameras 1-5 arranged in a clockwise direction, the original video streams corresponding to cameras 1-5 are original video streams 1-5, respectively, and each original video stream includes 100 frames. The encoder can then generate first-type video stream i and second-type video stream i based on original video stream i, where 1≤i≤5 and i is an integer. Assume that the first surround direction is clockwise, the second surround direction is counterclockwise, and both the first number and the second number are 1. Then:
First-type video stream 1 is the video stream obtained by taking the first frame of original video stream 1 as the starting point and selecting one frame in turn from original video streams 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...; the timestamps corresponding to the images in this video stream are consecutive. Specifically, the images in first-type video stream 1 are, in order: the 1st frame of original video stream 1, the 2nd frame of original video stream 2, the 3rd frame of original video stream 3, the 4th frame of original video stream 4, the 5th frame of original video stream 5, the 6th frame of original video stream 1, the 7th frame of original video stream 2, ..., and the 100th frame of original video stream 5.
By analogy, the encoder can obtain first-type video streams 2-5, as shown in FIG. 9.
Second-type video stream 1 is the video stream obtained by taking the first frame of original video stream 1 as the starting point and selecting one frame in turn from original video streams 1, 5, 4, 3, 2, 1, 5, 4, 3, 2, ...; the timestamps corresponding to the images in this video stream are consecutive. Specifically, the images in second-type video stream 1 are, in order: the 1st frame of original video stream 1, the 2nd frame of original video stream 5, the 3rd frame of original video stream 4, the 4th frame of original video stream 3, the 5th frame of original video stream 2, the 6th frame of original video stream 1, the 7th frame of original video stream 5, ..., and the 100th frame of original video stream 2.
By analogy, the encoder can obtain second-type video streams 2-5, as shown in FIG. 9.
As an example, the number of frames selected from one original video stream at a time can be regarded as the rotation interval, that is, the number of frames one video stream plays before switching to another video stream during surround playback. For example, based on the example in FIG. 9, during surround playback one video stream plays one frame and then switches to another video stream.
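To make the frame-selection rule concrete, the following is a minimal Python sketch of how a target video stream could be assembled from the original video streams; the frame representation, function name, and default rotation interval of 1 are illustrative assumptions, not part of the described embodiment.

```python
from typing import List, Tuple

Frame = Tuple[int, int]  # (camera_id, timestamp) stands in for real image data

def make_target_stream(originals: List[List[Frame]],
                       start_cam: int,
                       clockwise: bool,
                       rotation_interval: int = 1) -> List[Frame]:
    """Starting from the first frame of originals[start_cam], take
    rotation_interval frames from each camera in the surround direction,
    keeping the timestamps of the selected frames consecutive."""
    num_cams = len(originals)
    num_frames = len(originals[0])        # assume all streams have equal length
    step = 1 if clockwise else -1
    target: List[Frame] = []
    cam, t = start_cam, 0                 # t doubles as the timestamp index
    while t < num_frames:
        for _ in range(rotation_interval):
            if t >= num_frames:
                break
            target.append(originals[cam][t])
            t += 1
        cam = (cam + step) % num_cams     # move to the next camera position
    return target

# Example matching FIG. 9: 5 cameras, 100 frames each, rotation interval 1
originals = [[(c, t) for t in range(100)] for c in range(5)]
first_type_1 = make_target_stream(originals, start_cam=0, clockwise=True)
# first_type_1 starts (0,0), (1,1), (2,2), (3,3), (4,4), (0,5), ...
second_type_1 = make_target_stream(originals, start_cam=0, clockwise=False)
# second_type_1 starts (0,0), (4,1), (3,2), (2,3), (1,4), (0,5), ...
```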
S104: The encoder generates an index of each video stream in the multiple original video streams and the at least one target video stream.
The embodiments of the present application do not limit the specific implementation of the index of a video stream.
Optionally, the index of a video stream includes the identifier of the camera corresponding to the video stream and the category of the video stream. The camera corresponding to an original video stream is the camera that collected the original video stream, or the camera that collected the video stream from which the original video stream was obtained. The camera corresponding to a target video stream is the camera corresponding to the first frame of the target video stream; if that first frame was collected by a camera, or was obtained by processing an image collected by a camera, the first frame corresponds to that camera. The category of a video stream indicates whether the video stream is an original video stream or a target video stream. When the number of cameras is large, this helps save the storage space occupied by the indices of the video streams.
Further optionally, if the target video streams include first-type video streams and second-type video streams, the category of a video stream can indicate whether the video stream is an original video stream, a first-type video stream, or a second-type video stream.
For example, assume that the index of a video stream is [A, B], where A represents the identifier of the camera and B represents the category of the video stream, and the categories "00", "01", and "10" represent an original video stream, a first-type video stream, and a second-type video stream, respectively. Then, based on the example shown in FIG. 9, the index of each video stream can be as shown in Table 1:
Table 1

  Video stream                           Index [A, B]
  Original video stream i (i = 1..5)     [i, 00]
  First-type video stream i (i = 1..5)   [i, 01]
  Second-type video stream i (i = 1..5)  [i, 10]
Optionally, the encoder may number each of the multiple original video streams and the multiple target video streams in sequence, and use the number of each video stream as its index. For example, based on the example shown in FIG. 9, with 5 original video streams, 5 first-type video streams, and 5 second-type video streams, the indices of these 15 video streams may be 1-15 in sequence.
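The following short Python sketch illustrates the two indexing schemes described above; the dictionary layout and names are assumptions for illustration only.

```python
# Scheme 1: index of the form [A, B] = [camera identifier, category code]
CATEGORY = {"original": "00", "first_type": "01", "second_type": "10"}

def stream_index(camera_id: int, category: str) -> list:
    """[A, B] index: camera identifier plus stream category code."""
    return [camera_id, CATEGORY[category]]

# e.g. stream_index(3, "first_type") -> [3, "01"], the first-type stream
# whose first frame comes from camera 3

# Scheme 2: sequential numbering of the 15 streams in the FIG. 9 example
streams = [(cat, cam) for cat in CATEGORY for cam in range(1, 6)]
sequential_index = {s: i + 1 for i, s in enumerate(streams)}  # indices 1..15
```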
S105: The encoder determines, based on the number of frames included in one slice, the I-frames and P-frames in the multiple original video streams and the at least one target video stream, and compresses each video stream after the I-frames and P-frames are determined.
The number of frames included in one slice may be predefined, and may be updated after being predefined.
For example, assume that a video stream includes 100 frames and a slice includes 10 frames. Because the first frame of a slice is an I-frame and the other frames may be either I-frames or P-frames, in S105 the encoder may determine the 1st, 11th, 21st, 31st, ..., 91st frames of the video stream as I-frames and the other frames as P-frames.
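The I/P assignment rule in this example can be expressed as the following small sketch (frame numbering starting at 1 and a 10-frame slice are assumptions taken from the example above):

```python
def frame_type(frame_no: int, frames_per_slice: int = 10) -> str:
    """The first frame of every slice is an I-frame; the rest are P-frames."""
    return "I" if (frame_no - 1) % frames_per_slice == 0 else "P"

# frames 1, 11, 21, ..., 91 of a 100-frame stream become I-frames
i_frames = [n for n in range(1, 101) if frame_type(n) == "I"]
```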
S106: The encoder separately slices the compressed multiple original video streams and the compressed at least one target video stream obtained in S105 to obtain multiple slices, and generates an index of each slice.
Optionally, slices corresponding to the same timestamp include the same number of frames. This helps the corresponding video streams align at slice boundaries, which facilitates aligned slice downloading. The timestamp corresponding to a slice may be represented by the timestamp corresponding to an image at a specific position in the slice, or by the position of the slice in the video stream to which it belongs.
Optionally, every slice obtained in S106 includes the same number of frames, which keeps the encoding simple and fast.
The embodiments of the present application do not limit the specific implementation of the index of a slice. For example, the index of a slice may include the index of the slice within the compressed video stream to which it belongs. Taking a slice of 10 frames as an example, each video stream in FIG. 9 can be divided into 10 slices after compression; for each compressed video stream, frames 1-10 constitute slice 1 of the video stream, frames 11-20 constitute slice 2, ..., and frames 91-100 constitute slice 10. The indices of the slices of each compressed video stream are thus 1-10 in sequence.
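A minimal sketch of this slicing step, under the assumption of a fixed 10-frame slice and in-stream slice indices starting at 1, could look as follows:

```python
def slice_stream(frames: list, frames_per_slice: int = 10) -> dict:
    """Split a compressed stream (list of frames) into equal-length slices,
    returning a mapping from slice index (1-based) to the slice's frames."""
    return {i // frames_per_slice + 1: frames[i:i + frames_per_slice]
            for i in range(0, len(frames), frames_per_slice)}

slices = slice_stream(list(range(1, 101)))  # slice 1 holds frames 1-10, ..., slice 10 holds frames 91-100
```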
It should be noted that, for convenience and brevity of the drawings, FIG. 9 shows the video streams as containing slices; strictly speaking, it is the compressed video streams that contain the slices.
It should also be noted that, because each slice can be encoded, decoded, and transmitted independently, slicing the video streams allows the video to be transmitted and decoded at slice granularity during playback, which helps save transmission resources. Slicing the video streams is an optional step. If the encoder does not slice the video streams, it can set the first frame of each original video stream and each target video stream as an I-frame, which also enables the terminal to correctly decode the received video streams.
S107: The encoder sends the encoding result to the distribution-side system. The encoding result includes each compressed video stream and the index information generated during encoding. The index information may include the index of each video stream and the index of each slice in each video stream.
Optionally, the encoder and the distribution-side system may communicate using an over-the-top (OTT) protocol, which provides various application services to users over the Internet; this is of course not limited thereto.
S108: Based on the encoding result, the distribution-side system generates and stores a correspondence among "the identifier of the video content, the compressed video streams, and the index information".
The video streams obtained by compressing the video streams collected from different angles for the same spatial area in the same time period correspond, as a whole, to one piece of video content, and one piece of video content corresponds to one identifier (that is, the identifier of the video content). Different pieces of video content correspond to compressed video streams collected for the same spatial area in different time periods, or to compressed video streams collected for different spatial areas in the same time period (or in different time periods).
For example, based on the example shown in FIG. 9, the correspondence stored in S108 may include a correspondence among: the identifier of the video content, the 15 compressed video streams (that is, original video streams 1-5, first-type video streams 1-5, and second-type video streams 1-5), the index of each of the 15 video streams, and the index of each slice of each of the 15 video streams.
Optionally, the identifier of the video content may be a uniform resource locator (URL) that represents the storage location, in the distribution-side system, of the encoding result corresponding to the video content. Subsequently, the terminal can request the video content from the distribution-side system based on the identifier. For example, based on the example shown in FIG. 9, the URL may represent the storage location in the distribution-side system of "compressed original video streams 1-5, compressed first-type video streams 1-5, compressed second-type video streams 1-5, the indices of these video streams, the indices of the slices in these video streams, and so on".
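As a rough illustration of the stored correspondence, the distribution side could keep a catalog keyed by the content identifier; the field names and data layout below are assumptions for illustration only.

```python
catalog = {}  # content identifier (e.g. URL) -> encoding result for that content

def store_encoding_result(content_url: str, compressed_streams: dict,
                          stream_indices: dict, slice_indices: dict) -> None:
    """Record the correspondence generated in S108."""
    catalog[content_url] = {
        "streams": compressed_streams,      # stream index -> compressed stream
        "stream_indices": stream_indices,   # e.g. [camera_id, category] per stream
        "slice_indices": slice_indices,     # per stream: slice index -> location
    }
```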
In the video encoding method provided by this embodiment of the present application, a certain number of frames are selected, in a set direction, from the original video stream collected by each camera to obtain a target video stream, and the original video streams and the target video streams are compressed separately. Compressing a target video stream is equivalent to compressing an image of one original video stream with reference to an image of another original video stream. In this way, in the surround playback stage, the video clips provided by the distribution-side system to the terminal can be selected directly from the target video streams, and the terminal can decode an image of one original video stream based on an image of another original video stream. Therefore, compared with the conventional technology, the video streams transmitted during playback do not need to include many I-frames, which helps reduce the transmission bit rate of the video streams.
For example, based on the examples shown in FIG. 9 and FIG. 12 below, when the first frame of each slice in the original video streams and the target video streams is set as an I-frame and the other frames are set as P-frames, the terminal can correctly decode the received video streams. For the specific analysis, refer to the descriptions of FIG. 9 and FIG. 12; details are not repeated here.
In addition, the video encoding method provided by this embodiment of the present application does not need to merge multiple video streams into one high-resolution picture, so the number of cameras and the picture resolution are not limited. That is, after video encoding is performed using this technical solution, the transmission bit rate of the video streams can be reduced during playback while the picture resolution is guaranteed.
In some embodiments, optionally, the encoder may not slice the compressed multiple original video streams and the compressed at least one target video stream. In that case, S105-S106 may be replaced with the following steps 1-2:
Step 1: The encoder determines the first frame of each original video stream and each target video stream as an I-frame and the other frames as I-frames or P-frames (for example, all other frames are determined as P-frames), and compresses each video stream after the I-frames and P-frames are determined.
Step 2: The encoder separately encapsulates each compressed original video stream and each compressed target video stream to obtain multiple segments, and generates an index of each segment.
Based on this, the index information in S107 and S108 may include the index of each video stream and the index of each segment in each video stream.
That is, in this embodiment, the encoder may directly encapsulate the compressed video streams without slicing them. This technical solution is provided in consideration of the fact that segments can be transmitted independently.
In other embodiments, before performing S106, the encoder may encapsulate the images in the compressed multiple original video streams and the compressed at least one target video stream obtained in S105. Based on this, S106 may include: the encoder slices the multiple original video streams and the at least one target video stream after encapsulation.
Because segments can be transmitted independently, during video playback the terminal can obtain the to-be-played video clip from the distribution-side system at segment granularity, rather than having to obtain it at slice granularity or at video-stream granularity, which helps save transmission resources.
The number of frames included in one segment may be predefined, and may be updated after being predefined. One slice may include one or more segments.
Optionally, to implement single-frame rotation playback (that is, real-time surround playback), in which surround playback can be started no matter which frame of which video stream the terminal is currently playing, the encoder may encapsulate each frame of each of the compressed multiple original video streams and the compressed at least one target video stream into a separate segment.
As an example, an encapsulation format is shown in FIG. 10, which illustrates encapsulating the images belonging to one slice based on the fmp4 chunk method. Specifically, each frame of the slice is encapsulated into an independent mdat and encapsulated with a multi-moof header, where the moof header includes the styp, sidx, and moof parts. In this way, each frame of the video stream can be transmitted independently as a segment. The first frame of the slice is an I-frame, and the other frames are P-frames.
Optionally, if the encoder encapsulates the images in the compressed multiple original video streams and the compressed at least one target video stream obtained in S105, the index information in S107 may further include the index of each segment in each slice. The embodiments of the present application do not limit the specific implementation of the index of a segment. For example, the index of a segment may include the index of the segment within the slice to which it belongs. Taking a slice that includes 5 segments as an example, the indices of the segments in each slice may be 1-5.
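Under the "one frame per segment" encapsulation described above, a frame position could be mapped to a slice index and a within-slice segment index as in the following sketch; the 10-frame slice, the within-slice segment numbering, and the function name are assumptions for illustration.

```python
def locate_frame(frame_no: int, frames_per_slice: int = 10,
                 frames_per_segment: int = 1) -> tuple:
    """Map a 1-based frame number to (slice index, segment index within the slice)."""
    slice_idx = (frame_no - 1) // frames_per_slice + 1
    offset = (frame_no - 1) % frames_per_slice
    segment_idx = offset // frames_per_segment + 1
    return slice_idx, segment_idx

# frame 17 of a stream -> slice 2, segment 7 within that slice
assert locate_frame(17) == (2, 7)
```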
That is, in this embodiment, the encoder can both slice and encapsulate the compressed video streams. Because the first frame of each slice is an I-frame, whereas in an unsliced video stream usually only the first frame is an I-frame, this reduces the transmission of redundant images to a certain extent during playback, compared with the solution of "encapsulating only, without slicing".
The video playback method provided by the embodiments of the present application is described below with reference to the accompanying drawings.
FIG. 11 is a schematic flowchart of a video playback method provided by an embodiment of the present application. This embodiment is described by taking the application of the method to the video system shown in FIG. 5C as an example. For example, the terminal in this embodiment may be the terminal 30 or the terminal 31 in FIG. 5C, and the distribution-side system in this embodiment may be the distribution-side system 40 in FIG. 5C.
The method shown in FIG. 11 may include the following S201-S210:
S201: The terminal acquires the identifier of the video content to be played.
The video content to be played is the video content that the user wants to play, such as a football match or a concert. The video content to be played in S201 may be the video content in S108 above; based on the example shown in FIG. 9, this video content includes the 15 compressed video streams (that is, original video streams 1-5, first-type video streams 1-5, and second-type video streams 1-5), the index of each of the 15 video streams, and the index of each slice of each of the 15 video streams.
The embodiments of the present application do not limit the specific form of the identifier of the video content to be played. For example, the identifier may be the URL of the video content to be played.
The embodiments of the present application do not limit the manner of acquiring the identifier of the video content to be played. For example, the user may select the video content to be played by tapping a video link on the terminal screen, and the terminal receives the user's operation instruction and obtains the URL of the video content to be played based on the instruction. As another example, the distribution-side system may actively push the URLs of one or more pieces of video content to the terminal, where these include the URL of the video content to be played.
S202: Based on the identifier of the video content to be played, the terminal obtains the index information corresponding to the video content to be played from the distribution-side system. The index information may include the indices of the video streams corresponding to the video content to be played and the indices of the slices in those video streams (for example, the index of each video stream and the index of each slice in each video stream).
Taking OTT-based communication between the terminal and the distribution-side system as an example, the terminal may send a request carrying the identifier of the video content to be played to the distribution-side system to request the corresponding index information. Based on the identifier, the distribution-side system obtains the index information corresponding to the video content to be played and feeds it back to the terminal. Of course, the distribution-side system may also actively push the index information corresponding to the video content to be played to the terminal.
S203: The terminal requests the initial video stream of the video content to be played from the distribution-side system, decompresses the requested initial video stream, and plays the decompressed initial video stream.
The initial video stream may be predefined. For example, the initial video stream may be one of the original video streams corresponding to the video content to be played; it may be specified by an administrator, or determined by the distribution-side system according to the size of the index.
Specifically, taking slice-granularity transmission between the terminal and the distribution-side system as an example, S203 may include: the terminal requests some of the slices of the initial video stream of the video content to be played from the distribution-side system, decompresses the requested slices, and plays the decompressed slices.
As an example, the terminal requests the first slice, the second slice, and so on of the initial video stream of the video content to be played, until the terminal determines that surround playback needs to be started, for example, until the terminal receives the first operation in S204. That is, in this example, before receiving the first operation in S204, the terminal plays the initial video stream sequentially (that is, the normal playback stage). Receiving the first operation in S204 indicates that the terminal needs to start surround playback of the video content. For example, assuming the initial video stream is original video stream 1, the terminal may request slice 1, slice 2, slice 3, ... of compressed original video stream 1 from the distribution-side system in sequence, decompress the requested slices, and play the decompressed slices, and it stops obtaining slices of original video stream 1 when it performs S204.
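The normal playback stage could be sketched as the following loop; fetch_slice, decode_and_play, and surround_requested are placeholder callables standing in for the terminal's transport, decoder, and operation-detection logic.

```python
def play_initial_stream(fetch_slice, decode_and_play, surround_requested,
                        initial_stream_index) -> None:
    """Request and play slices of the initial stream one by one until a
    surround-playback trigger (the first operation in S204) arrives."""
    slice_idx = 1
    while not surround_requested():
        compressed = fetch_slice(initial_stream_index, slice_idx)
        if compressed is None:            # no more slices in the stream
            break
        decode_and_play(compressed)
        slice_idx += 1
```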
It should be noted that S203 is an optional step. The technical solution without S203 can be understood as: the terminal directly performs surround playback of the video streams corresponding to the video content to be played. The technical solution with S203 can be understood as: the terminal first plays (or plays normally) the video streams corresponding to the video content to be played in sequence, and then performs surround playback of them.
S204: The terminal receives a first operation, and in response to the first operation, the terminal obtains the surround direction of the video clip to be played, the timestamp corresponding to the start image of the video clip to be played, and the timestamp corresponding to the end image of the video clip to be played.
That is, the terminal starts surround playback under the user's instruction. Of course, the embodiments of the present application are not limited thereto; for example, the terminal may start surround playback when a preset image in a certain video stream is played.
The embodiments of the present application do not limit the specific implementation of the first operation. For example, the first operation may be a touch operation, a press operation, a voice operation, or a combination of at least two of the foregoing. Some possible implementations are listed below:
In one implementation, taking a terminal with a touchscreen, such as a smartphone or a tablet computer, as an example, the terminal receives the user's touch operation on a rotation control on the touchscreen. In response to the touch operation, the terminal determines the surround direction of the video clip to be played, and the timestamps corresponding to the start image and the end image of the video clip to be played.
For example, a touch operation that slides the rotation control in a first direction (for example, to the left) indicates that the surround direction of the video clip to be played is clockwise, and a touch operation that slides the rotation control in a second direction (for example, to the right) indicates that the surround direction is counterclockwise. That is, touch operations in different directions on the same rotation control indicate different surround directions.
As another example, a touch operation on a first rotation control indicates that the surround direction of the video clip to be played is clockwise, and a touch operation on a second rotation control indicates that the surround direction is counterclockwise. That is, touch operations on different rotation controls indicate different surround directions.
Optionally, the terminal determines the timestamp corresponding to the start image of the video clip to be played from the start time of the touch operation. For example, without considering delay, the terminal may use the timestamp corresponding to the frame following the image being played when the touch operation is received as the timestamp corresponding to the start image of the video clip to be played. As another example, considering delay, the terminal may use the timestamp corresponding to the N-th frame after the image being played when the touch operation is received as the timestamp corresponding to the start image, where N is an integer greater than 1 and is a predefined value.
Optionally, the terminal determines the duration of the video clip to be played from the touch duration of the touch operation. Subsequently, the terminal can determine the timestamp corresponding to the end image of the video clip to be played based on the timestamp corresponding to the start image and the duration of the video clip to be played.
In another implementation, taking a terminal with buttons, such as a remote control, a set top box (STB), a smartphone, or a tablet computer, as an example, the terminal receives the user's press operation on a rotation button. In response to the press operation, the terminal determines the surround direction of the video clip to be played, and the timestamps corresponding to the start image and the end image of the video clip to be played. Similarly, press operations in different directions on the same button indicate different surround directions, or press operations on different buttons indicate different surround directions. Optionally, the terminal may determine the timestamp corresponding to the start image of the video clip to be played from the start time of the press operation, and may determine the duration of the video clip to be played from the press duration or the number of presses.
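A minimal sketch of mapping such a first operation to the surround-playback parameters is shown below; the use of integer frame indices as timestamps, the frame rate, and the delay compensation of one frame are assumptions for illustration only.

```python
def interpret_first_operation(swipe_left: bool, current_timestamp: int,
                              touch_duration_s: float, fps: int = 25,
                              delay_frames: int = 1) -> tuple:
    """Derive (surround direction, start timestamp, end timestamp) from a
    touch/press operation received while the frame at current_timestamp plays."""
    direction = "clockwise" if swipe_left else "counterclockwise"
    start_ts = current_timestamp + delay_frames       # next (or N-th next) frame
    clip_len = max(1, round(touch_duration_s * fps))  # duration -> frame count
    end_ts = start_ts + clip_len - 1
    return direction, start_ts, end_ts
```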
S205: Based on the surround direction of the video clip to be played and the timestamp corresponding to the start image, the terminal determines, from the at least one target video stream, the index of the video stream to which the video clip to be played belongs (that is, the first target video stream).
The surround direction corresponding to the first target video stream is the same as the surround direction of the video clip to be played. The first target video stream contains the frame preceding the first image in the currently playing video stream, where the timestamp corresponding to the first image is the same as the timestamp corresponding to the start image of the video clip to be played.
For example, based on FIG. 9, taking the timestamps corresponding to the images in each video stream as 1-100 in sequence, assume that the terminal starts playing from the 1st frame of original video stream 1 and receives the first operation when playing the 4th frame (corresponding to timestamp 4). In response to the first operation, the terminal determines that the surround direction of the video clip to be played is clockwise, so the terminal can determine that the type of the first target video stream is the first type. In response to the first operation, without considering delay, the terminal determines that the timestamp corresponding to the start image of the video clip to be played is timestamp 5. It follows that the first image is the 5th frame of original video stream 1, and the first target video stream is the first-type video stream containing the 4th frame of original video stream 1, that is, first-type video stream 3.
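For the FIG. 9 layout (5 cameras, rotation interval 1, timestamps 1-100), the selection of the first target video stream can be sketched as follows; the closed-form relation between stream number, timestamp, and camera is an assumption derived from that example, not a general rule stated in the embodiment.

```python
def first_target_stream(current_cam: int, start_ts: int, clockwise: bool,
                        num_cams: int = 5) -> int:
    """Find which first-type (clockwise) or second-type (counterclockwise)
    stream contains the frame of the currently played camera at timestamp
    start_ts - 1. Assumes target stream k shows, at timestamp t, the frame
    of camera ((k - 1) + step * (t - 1)) mod num_cams + 1."""
    step = 1 if clockwise else -1
    # solve for k so that the camera at timestamp (start_ts - 1) is current_cam
    return (current_cam - 1 - step * (start_ts - 2)) % num_cams + 1

# Example from the text: playing original stream 1, start timestamp 5, clockwise
assert first_target_stream(current_cam=1, start_ts=5, clockwise=True) == 3
```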
S206: Based on the timestamp corresponding to the start image and the index of the first target video stream, the terminal determines the index of the first slice to which the start image belongs; and based on the timestamp corresponding to the end image and the index of the first target video stream, the terminal determines the index of the second slice to which the end image belongs.
For example, the terminal determines the index of the start image based on the timestamp corresponding to the start image and the index of the first target video stream, and determines the index of the end image based on the timestamp corresponding to the end image and the index of the first target video stream; then, based on the index of the start image, the terminal determines the index of the first slice to which the start image belongs, and based on the index of the end image, determines the index of the second slice to which the end image belongs.
As another example, the terminal may directly determine the index of the first slice to which the start image belongs based on the timestamp corresponding to the start image and the index of the first target video stream; correspondingly, the terminal may directly determine the index of the second slice to which the end image belongs based on the timestamp corresponding to the end image and the index of the first target video stream.
Based on the example in S205, taking slice-granularity transmission between the distribution-side system and the terminal as an example, because the start image is the 5th frame of first-type video stream 3 and the end image is the 17th frame of first-type video stream 3, the first slice and the second slice are slice 1 and slice 2 of first-type video stream 3, respectively.
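With equal-length slices, the slice lookup reduces to the following small sketch (the 10-frame slice and 1-based numbering follow the running example):

```python
def slice_of(timestamp: int, frames_per_slice: int = 10) -> int:
    """Slice index (1-based) of the frame at the given timestamp."""
    return (timestamp - 1) // frames_per_slice + 1

# start image at timestamp 5 -> slice 1; end image at timestamp 17 -> slice 2
assert slice_of(5) == 1 and slice_of(17) == 2
```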
S207: The terminal sends a request message to the distribution-side system, where the request message includes the index of the first target video stream, the index of the first slice, and the index of the second slice.
S208: The distribution-side system searches for the video clip to be played based on the request message.
Specifically, the distribution-side system finds the first target video stream from the multiple original video streams and the at least one target video stream based on the index of the first target video stream; then it finds the first slice in the first target video stream based on the index of the first slice, and finds the second slice in the first target video stream based on the index of the second slice.
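This lookup could be sketched as follows, assuming the distribution side keeps each compressed stream as a mapping from slice index to the slice's data; the data layout and function name are assumptions for illustration only.

```python
def find_clip(streams: dict, stream_index, first_slice: int, second_slice: int) -> list:
    """Resolve the stream by its index, then return all slices from the
    first requested slice up to the second requested slice."""
    stream = streams[stream_index]
    return [stream[i] for i in range(first_slice, second_slice + 1)]
```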
S209: The distribution-side system sends the found video clip to be played to the terminal.
Specifically, the distribution-side system sends the first slice and the second slice to the terminal.
S210: The terminal decompresses the video clip to be played and plays the decompressed video clip.
At this point, surround playback is achieved.
Optionally, after the surround playback ends, the terminal may take the frame following the last frame of the surround playback stage in the original video stream to which that last frame belongs as a starting point, and continue to request and play the images after that starting point in that original video stream.
Optionally, before performing S210, the method may further include: the terminal splices the video clip to be played after the already-played video clip according to the timestamps corresponding to the images in the received video clip to be played. In S210, the terminal decompresses the video clip to be played based on the spliced video clip.
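A minimal sketch of this optional splicing step is shown below; frames are assumed to be (timestamp, compressed payload) pairs, which is an illustrative representation only.

```python
def splice(played_frames: list, received_clip: list) -> list:
    """Append the received to-be-played clip after the already-played frames
    in timestamp order, so each P-frame follows the frame it was encoded
    against and can be decompressed in S210."""
    return played_frames + sorted(received_clip, key=lambda frame: frame[0])
```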
FIG. 12 is a schematic diagram, based on the example in S206, of the video streams transmitted between the distribution-side system and the terminal and the video stream played by the terminal. The video streams transmitted between the distribution-side system and the terminal are, in order: slice 1 of compressed original video stream 1, slices 1-2 of compressed first-type video stream 3, and slices 2, 3, ... of compressed original video stream 4.
The terminal performs the following steps:
After receiving slice 1 of compressed original video stream 1, the terminal decompresses frames 1-4 of compressed original video stream 1.
After receiving slice 1 of compressed first-type video stream 3, the terminal splices frames 5-10 of compressed first-type video stream 3 after frames 1-4 of original video stream 1, and then decompresses frames 5-10 of compressed first-type video stream 3. Because, during encoding, the 5th frame of first-type video stream 3 was compressed based on the 4th frame of original video stream 1, during decoding the 5th frame of compressed first-type video stream 3 can be decompressed based on the 4th frame of original video stream 1.
After receiving slice 2 of compressed first-type video stream 3, the terminal splices frames 11-13 of compressed first-type video stream 3 after frames 5-10 of first-type video stream 3, and then decompresses frames 11-13 of compressed first-type video stream 3. Because the 11th frame of first-type video stream 3 is an I-frame and can be decoded independently, the terminal may also directly decompress frames 11-13 of compressed first-type video stream 3 without splicing.
After receiving slice 2 of compressed original video stream 4, the terminal splices frames 14-20 of compressed original video stream 4 after frames 11-13 of first-type video stream 3, and then decompresses frames 14-20 of compressed original video stream 4. Because, during encoding, the 14th frame of original video stream 4 was encoded based on the 13th frame of original video stream 4, and the 13th frame of original video stream 4 is the same as the 13th frame of first-type video stream 3, the 14th frame of original video stream 4 was in effect encoded based on the 13th frame of first-type video stream 3. Therefore, during decoding, the 14th frame of compressed original video stream 4 can be decompressed based on the 13th frame of first-type video stream 3.
In the video playback method provided by this embodiment of the present application, the distribution-side system, at the request of the terminal, selects the video clip currently required by the terminal from the at least one target video stream, where a target video stream is a video stream obtained by selecting, in a set direction, a certain number of frames from the original video stream collected by each camera. Because the terminal's video clip to be played is selected from the first target video stream, the terminal does not need to obtain images compressed based on different original video streams, which helps reduce the transmission bit rate of the video streams.
Further, because the encoder compresses the target video streams in the encoding stage, which is equivalent to compressing an image of one original video stream with reference to an image of another original video stream, and because in the surround playback stage the video clips provided by the distribution-side system to the terminal are selected directly from the target video streams, the terminal can decode an image of one original video stream based on an image of another original video stream. Therefore, compared with the conventional technology, the video streams transmitted during playback do not need to include many I-frames, which helps reduce the transmission bit rate of the video streams. Specific examples are shown in FIG. 9 and FIG. 12.
In addition, because the foregoing video encoding method does not need to merge multiple video streams into one high-resolution picture, the number of cameras and the picture resolution are not limited. That is, after video encoding is performed using the foregoing method, the transmission bit rate of the video streams can be reduced during playback while the picture resolution is guaranteed.
In addition, in this technical solution, determining the index of the first target video stream and the indices of the slices to which the start image and the end image of the video clip to be played belong is performed on the terminal side, without the distribution-side system being aware of it. The changes required on the distribution-side system are therefore small, and the solution can be applied to a conventional distribution-side system.
Alternatively, S204-S207 may also be performed by the distribution-side system. For example, the terminal sends information related to the first operation (such as the rotation button targeted by the first operation and the operation duration of the first operation) to the distribution-side system, and the distribution-side system performs this process based on the information related to the first operation. In this case, S202 does not need to be performed either; that is, the terminal does not need to obtain the various index information of the video streams. This helps relieve the processing pressure on the terminal.
In some embodiments, if S106 is replaced with encapsulating the compressed multiple original video streams and the compressed at least one target video stream, then:
The index information in S202 may further include the indices of the segments in the slices, for example, the index of each segment in each slice.
S206 may further include: the terminal determines, in the first slice, the index of the segment to which the start image belongs, and determines, in the second slice, the index of the segment to which the end image belongs. Based on the example in S205, the segment to which the start image of the video clip to be played belongs and the segment to which the end image belongs are the 5th segment and the 17th segment of first-type video stream 3, respectively.
The request message in S207 may further include the index, in the first slice, of the segment to which the start image belongs, and the index, in the second slice, of the segment to which the end image belongs.
S208 may further include: the distribution-side system finds, in the first slice, the segment to which the start image belongs based on the index of that segment, and finds, in the second slice, the segment to which the end image belongs based on the index of that segment.
S209 may include: the distribution-side system sends, to the terminal, the segment in the first slice to which the start image belongs, the segment in the second slice to which the end image belongs, and the other segments between these two segments.
如图13所示,为基于S205中的示例并结合分段,提供的分发侧系统与终端之间传输的视频流和终端播放的视频流的示意图。分发侧系统与终端之间传输的视频流依次为:压缩后的原始视频流1中的第1-4帧图像,压缩后的第一类视频流3中的第5-17帧图像,压缩后的原始视频流4中的第18、19……帧图像。As shown in FIG. 13 , it is a schematic diagram of the video stream transmitted between the distribution side system and the terminal and the video stream played by the terminal provided based on the example in S205 and in combination with segmentation. The video streams transmitted between the distribution-side system and the terminal are in sequence: the 1st to 4th frames of the compressed original video stream 1, the 5th to 17th frames of the compressed first-type video stream 3, and the compressed The 18th, 19th... frame images in the original video stream 4 of .
关于拼接解压缩等相关说明可以参考上述对图12的说明,此处不再赘述。For related descriptions such as splicing and decompression, reference may be made to the above description of FIG. 12 , which will not be repeated here.
相比图12的示例,在图13所示的示例中,分发侧系统与终端之间传输的图像较少,也就是说,将一个分片中的图像封装为多个分段,并基于分段粒度传输视频流,有助于降低冗余图像的传输,从而节省传输资源。同时,采用分段粒度传输在终端播放时与采用分片传输一致,均依次播放原始视频流1中的第1-4帧图像,第一类视频流3中的第5-17帧图像,以及原始视频流4中的第18、19……帧图像。Compared with the example of FIG. 12 , in the example shown in FIG. 13 , fewer images are transmitted between the distribution-side system and the terminal, that is, the images in one slice are encapsulated into multiple Transmitting video streams at segment granularity helps reduce the transmission of redundant images, thereby saving transmission resources. At the same time, the use of segmented granularity transmission is consistent with the use of segmented transmission during terminal playback, and the first to fourth frames in the original video stream 1, the 5th to 17th frames in the first type of video stream 3 are played in sequence, and The 18th, 19th... frame images in the original video stream 4.
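The request and response at segment granularity can be pictured with the following sketch, which assumes 0-based indexes and an in-memory mapping from a slice index to its ordered list of segments; the helper names and message fields are illustrative only.

```python
def build_segment_request(stream_index, start_slice, start_segment, end_slice, end_segment):
    # Request message of S207, extended with segment indexes as in this embodiment.
    # All indexes are treated as 0-based positions for simplicity.
    return {
        "target_stream_index": stream_index,  # e.g. {"camera_id": 3, "category": "target"}
        "start": {"slice": start_slice, "segment": start_segment},
        "end": {"slice": end_slice, "segment": end_segment},
    }

def select_segments(slices, request):
    # Distribution-side selection (S208/S209): return the segment containing the start
    # image, the segment containing the end image, and every segment between them.
    # `slices` is assumed to map a slice index of the first target video stream to its
    # ordered list of segments.
    start, end = request["start"], request["end"]
    selected = []
    for slice_idx in range(start["slice"], end["slice"] + 1):
        segments = slices[slice_idx]
        first = start["segment"] if slice_idx == start["slice"] else 0
        last = end["segment"] if slice_idx == end["slice"] else len(segments) - 1
        selected.extend(segments[first:last + 1])
    return selected
```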
In some other embodiments, if, in the video encoding phase, the encoder does not slice the compressed video streams but encapsulates them directly, then:
The index information in S202 may include: indexes of the video streams corresponding to the to-be-played video content and indexes of the segments in the video streams (for example, an index of each video stream and an index of each segment in each video stream).
S206 may be replaced with: the terminal determines, based on the timestamp corresponding to the start image and the index of the first target video stream, the index of the first segment to which the start image belongs, and determines, based on the timestamp corresponding to the end image and the index of the first target video stream, the index of the second segment to which the end image belongs.
The request message in S207 may include: the index of the first target video stream, the index of the first segment, and the index of the second segment.
S209 specifically includes: the distribution-side system sends, to the terminal, the first segment, the second segment, and the segments between these two segments.
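How a timestamp is mapped to the segment that contains it is not spelled out above; one simple possibility, assuming a constant frame rate and a fixed number of frames per segment, is sketched below. The fixed-size assumption and the helper name are illustrative only.

```python
def segment_index_for_timestamp(timestamp, stream_start_ts, frame_interval, frames_per_segment):
    """Return the 0-based index of the segment containing `timestamp`.

    Assumes the first target video stream starts at `stream_start_ts`, its images are
    spaced `frame_interval` seconds apart (consecutive timestamps), and every segment
    encapsulates exactly `frames_per_segment` images.
    """
    frame_number = round((timestamp - stream_start_ts) / frame_interval)
    return frame_number // frames_per_segment

# The terminal could then fill the request message of S207 with, for example:
first_segment = segment_index_for_timestamp(0.20, 0.0, 1 / 25, 5)   # -> 1
second_segment = segment_index_for_timestamp(0.68, 0.0, 1 / 25, 5)  # -> 3
```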
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of methods. To implement the foregoing functions, corresponding hardware structures and/or software modules for performing the functions are included. A person skilled in the art should be easily aware that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered as going beyond the scope of this application.
In the embodiments of this application, functional modules of the video encoding apparatus (for example, an encoder) and the video decoding apparatus (for example, a CDN system or a terminal) may be divided according to the foregoing method examples. For example, each functional module may be obtained through division corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or may be implemented in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is an example and is merely logical function division; there may be other division manners in actual implementation.
As shown in FIG. 14, FIG. 14 is a schematic structural diagram of a video encoding apparatus 140 according to an embodiment of this application. The video encoding apparatus 140 may be configured to perform any of the video encoding methods provided above, for example, to perform the steps performed by the encoder in the video encoding method shown in FIG. 8.
For example, the video encoding apparatus 140 may include: an obtaining unit 1401, a generation unit 1402, and a compression unit 1403.
The obtaining unit 1401 is configured to obtain a plurality of original video streams, where the plurality of original video streams are obtained based on video streams captured by a plurality of cameras for a same spatial area in a same time period. The generation unit 1402 is configured to generate at least one target video stream based on the plurality of original video streams, where the target video stream is a video stream obtained by selecting a certain number of frames of images from the original video stream corresponding to each camera in a set direction. The compression unit 1403 is configured to compress the at least one target video stream. For example, with reference to FIG. 8, the obtaining unit 1401 may be configured to perform the receiving action corresponding to S102, the generation unit 1402 may be configured to perform S103, and the compression unit 1403 may be configured to perform S105.
Optionally, the timestamps corresponding to the images in the target video stream are consecutive.
Optionally, the plurality of original video streams include a first original video stream, and the target video stream starts from the first frame of image in the first original video stream.
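As an illustration of how the generation unit 1402 could assemble such a stream, the following sketch walks the cameras in the surround order, taking one frame per camera so that the selected images have consecutive timestamps and the stream starts from the first frame of the chosen original video stream. Taking exactly one frame per camera is an assumption made for illustration; the embodiments only require that a certain number of frames be selected from each camera.

```python
def generate_target_stream(original_streams, start_camera, direction):
    """Build one target video stream by walking the cameras in the surround order.

    `original_streams` is assumed to be a list of per-camera frame lists, ordered along
    the surround; `direction` is +1 (e.g. clockwise) or -1 (counterclockwise). Frame k of
    the target stream is taken from camera (start_camera + k * direction), so the
    timestamps stay consecutive while the viewpoint rotates one position per frame.
    """
    num_cameras = len(original_streams)
    num_frames = len(original_streams[0])
    target = []
    for k in range(num_frames):
        camera = (start_camera + k * direction) % num_cameras
        target.append(original_streams[camera][k])
    return target

# Example with 3 cameras and 4 frames per camera: the result starts from the first
# frame of camera 0 and then rotates clockwise through cameras 1, 2, 0, ...
streams = [[f"cam{c}-frame{k}" for k in range(4)] for c in range(3)]
print(generate_target_stream(streams, start_camera=0, direction=+1))
# ['cam0-frame0', 'cam1-frame1', 'cam2-frame2', 'cam0-frame3']
```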
Optionally, the video encoding apparatus 140 further includes: a sending unit 1404 and an encapsulation unit 1405.
Optionally, the generation unit 1402 is further configured to generate an index of the at least one target video stream. The index of a target video stream includes: an identifier of the camera corresponding to the target video stream and the category of the target video stream, where the camera corresponding to the target video stream is the camera corresponding to the first frame of image in the target video stream, and the category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream. The sending unit 1404 is configured to send the index of the at least one target video stream. For example, with reference to FIG. 8, the generation unit 1402 may be configured to perform S106, and the sending unit 1404 may be configured to perform S107.
Optionally, the set direction is a surround direction of the plurality of cameras, and the surround direction includes a clockwise direction or a counterclockwise direction.
Optionally, the encapsulation unit 1405 is configured to encapsulate the compressed at least one target video stream to obtain a plurality of segments. The generation unit 1402 is further configured to generate indexes of the plurality of segments. The sending unit 1404 is configured to send the indexes of the plurality of segments.
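A minimal sketch of what the indexes generated by the generation unit 1402 and sent by the sending unit 1404 could look like; the field names and container types are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class VideoStreamIndex:
    camera_id: str   # identifier of the camera of the first frame of image in the stream
    category: str    # "original" or "target": whether the stream is an original or a target video stream

@dataclass
class SegmentIndex:
    stream: VideoStreamIndex
    position: int    # position of the segment within the encapsulated, compressed stream

# Example: the indexes the sending unit 1404 could publish for one target video stream
# that has been encapsulated into three segments.
stream_index = VideoStreamIndex(camera_id="cam-3", category="target")
segment_indexes = [SegmentIndex(stream_index, i) for i in range(3)]
```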
For specific descriptions of the foregoing optional manners, refer to the foregoing method embodiments. Details are not repeated here. In addition, for the explanation of any video encoding apparatus 140 provided above and the description of its beneficial effects, refer to the corresponding method embodiments above. Details are not repeated.
As an example, with reference to FIG. 7, some or all of the functions implemented by the obtaining unit 1401, the generation unit 1402, the compression unit 1403, and the encapsulation unit 1405 in the video encoding apparatus 140 may be implemented by the processor 701 in FIG. 7 executing program code in the memory 702 in FIG. 7. The sending unit 1404 may be implemented by the sending unit in the communications interface 703 in FIG. 7.
As shown in FIG. 15, FIG. 15 is a schematic structural diagram of a video playback apparatus 150 according to an embodiment of this application. The video playback apparatus 150 may be configured to perform any of the video playback methods provided above, for example, to perform the steps performed by the terminal in the video playback method shown in FIG. 11.
For example, the video playback apparatus 150 may include: a receiving unit 1501, a decompression unit 1502, and a playback unit 1503.
The receiving unit 1501 is configured to receive a to-be-played video clip, where the to-be-played video clip is selected from a first target video stream. The first target video stream is a video stream obtained by selecting a certain number of frames of images from the original video stream corresponding to each of a plurality of cameras in a set direction. The plurality of original video streams corresponding to the plurality of cameras are obtained based on video streams captured by the plurality of cameras for a same spatial area in a same time period. The decompression unit 1502 is configured to decompress the to-be-played video clip. The playback unit 1503 is configured to play the decompressed video clip. For example, with reference to FIG. 11, the receiving unit 1501 may be configured to perform the receiving action corresponding to S209, the decompression unit 1502 may be configured to perform the decompression step in S210, and the playback unit 1503 may be configured to perform the playback step in S210.
Optionally, the timestamps corresponding to the images in the first target video stream are consecutive.
Optionally, the plurality of original video streams include a first original video stream, and the first target video stream starts from the first frame of image in the first original video stream.
Optionally, the video playback apparatus 150 further includes: a sending unit 1504 and a determining unit 1505.
Optionally, the sending unit 1504 is configured to send a request message, where the request message includes an index of the first target video stream, and the index of the first target video stream is used to determine the first target video stream. The index of the first target video stream includes an identifier of the camera corresponding to the first target video stream and the category of the first target video stream. The camera corresponding to the first target video stream is the camera corresponding to the first frame of image in the first target video stream. The category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream. For example, with reference to FIG. 11, the sending unit 1504 may be configured to perform S207.
Optionally, the request message further includes an index of the target segment to which the to-be-played video clip belongs. The receiving unit 1501 is specifically configured to receive the target segment.
Optionally, the determining unit 1505 is configured to: determine the surround direction of the to-be-played video clip and the timestamp corresponding to the start image of the to-be-played video clip; and determine the index of the first target video stream based on the surround direction of the to-be-played video clip and the timestamp corresponding to the start image of the to-be-played video clip.
Optionally, the surround direction corresponding to the first target video stream is the same as the surround direction of the to-be-played video clip. The first target video stream includes the frame of image immediately preceding a first image in the currently played video stream, where the timestamp corresponding to the first image is the same as the timestamp corresponding to the start image.
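The following sketch shows one way the determining unit 1505 could derive the index of the first target video stream from the surround direction and the start timestamp. It assumes that a target video stream advances by one camera per frame in its surround direction, that the currently played stream is the original video stream of a single camera, and that streams are identified by the camera of their first frame; these are illustrative assumptions, not requirements of the embodiments.

```python
def first_target_stream_index(current_camera, surround_direction, start_ts,
                              stream_start_ts, frame_interval, num_cameras):
    """Derive the index of the first target video stream (illustrative sketch).

    The chosen stream is the one that, at the frame immediately preceding `start_ts`,
    shows the same camera as the currently played original video stream, so that
    switching streams does not skip or repeat a viewpoint.
    """
    step = 1 if surround_direction == "clockwise" else -1
    prev_frame_number = round((start_ts - stream_start_ts) / frame_interval) - 1
    start_camera = (current_camera - step * prev_frame_number) % num_cameras
    return {"camera_id": start_camera, "category": "target", "direction": surround_direction}

# Example: 8 cameras, 25 fps, the user starts a clockwise rotation at t = 0.40 s while
# watching the original video stream of camera 2.
print(first_target_stream_index(2, "clockwise", 0.40, 0.0, 1 / 25, 8))
# {'camera_id': 1, 'category': 'target', 'direction': 'clockwise'}
```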
For specific descriptions of the foregoing optional manners, refer to the foregoing method embodiments. Details are not repeated here. In addition, for the explanation of any video playback apparatus 150 provided above and the description of its beneficial effects, refer to the corresponding method embodiments above. Details are not repeated.
As an example, with reference to FIG. 7, some or all of the functions implemented by the decompression unit 1502 and the determining unit 1505 in the video playback apparatus 150 may be implemented by the processor 701 in FIG. 7 executing program code in the memory 702 in FIG. 7. The receiving unit 1501 may be implemented by the receiving unit in the communications interface 703 in FIG. 7. The sending unit 1504 may be implemented by the sending unit in the communications interface 703 in FIG. 7. The playback unit 1503 may be implemented by a display screen, an audio input/output apparatus, and the like (not shown in FIG. 7).
As shown in FIG. 16, FIG. 16 is a schematic structural diagram of a video playback apparatus 160 according to an embodiment of this application. The video playback apparatus 160 may be configured to perform any of the video playback methods provided above, for example, to perform the steps performed by the network device in the video playback method shown in FIG. 11.
For example, the video playback apparatus 160 may include: a determining unit 1601 and a sending unit 1602.
The determining unit 1601 is configured to determine a to-be-played video stream clip. The to-be-played video clip is selected from a first target video stream, and the first target video stream is a video stream obtained by selecting a certain number of frames of images from the original video stream corresponding to each of a plurality of cameras in a set direction. The plurality of original video streams corresponding to the plurality of cameras are obtained based on video streams captured by the plurality of cameras for a same spatial area in a same time period. The sending unit 1602 is configured to send the to-be-played video stream clip to the terminal. For example, with reference to FIG. 11, the determining unit 1601 may be configured to perform S208, and the sending unit 1602 may be configured to perform S209.
Optionally, the timestamps corresponding to the images in the first target video stream are consecutive.
Optionally, the plurality of original video streams include a first original video stream, and the first target video stream starts from the first frame of image in the first original video stream.
Optionally, the video playback apparatus 160 may further include a receiving unit 1603. The receiving unit 1603 is configured to receive a request message sent by the terminal, where the request message is used to request the to-be-played video clip.
Optionally, the request message includes an index of the first target video stream, where the index of the first target video stream includes an identifier of the camera corresponding to the first target video stream and the category of the first target video stream. The camera corresponding to the first target video stream is the camera corresponding to the first frame of image in the first target video stream. The category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream. The determining unit 1601 is further configured to determine the first target video stream based on the index of the first target video stream.
Optionally, the sending unit 1602 is further configured to send an index of at least one target video stream to the terminal. The at least one target video stream is generated based on the plurality of original video streams, and the at least one target video stream includes the first target video stream.
Optionally, the request message further includes an index of the target segment to which the to-be-played video clip belongs. The determining unit 1601 is further configured to determine the target segment from the first target video stream based on the index of the target segment. In this case, the sending unit 1602 is specifically configured to send the target segment to the terminal.
Optionally, the sending unit 1602 is further configured to send, to the terminal, indexes of the segments in at least one target video stream, where the at least one target video stream is generated based on the plurality of original video streams, and the at least one target video stream includes the first target video stream.
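A sketch of the corresponding server-side path through the determining unit 1601 and the sending unit 1602: the stream index in the request selects the stored first target video stream, and the optional target segment index narrows the response to a single segment. The storage layout and field names are assumptions for illustration.

```python
class DistributionSide:
    """Illustrative storage and request handling on the distribution side."""

    def __init__(self, streams):
        # `streams` is assumed to map (camera_id, category) to the ordered list of
        # stored (compressed, encapsulated) segments of that video stream.
        self.streams = streams

    def handle_request(self, request):
        # Determine the first target video stream from its index (camera id + category),
        # then, if a target segment index is present, narrow the response to it.
        key = (request["camera_id"], request["category"])
        segments = self.streams[key]
        if "target_segment" in request:
            return [segments[request["target_segment"]]]
        return segments

# Example: request segment 5 of the target video stream whose first frame comes from camera 3.
# server = DistributionSide(stored_streams)
# clip = server.handle_request({"camera_id": 3, "category": "target", "target_segment": 5})
```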
For specific descriptions of the foregoing optional manners, refer to the foregoing method embodiments. Details are not repeated here. In addition, for the explanation of any video playback apparatus 160 provided above and the description of its beneficial effects, refer to the corresponding method embodiments above. Details are not repeated.
As an example, with reference to FIG. 7, some or all of the functions implemented by the determining unit 1601 in the video playback apparatus 160 may be implemented by the processor 701 in FIG. 7 executing program code in the memory 702 in FIG. 7. The sending unit 1602 may be implemented by the sending unit in the communications interface 703 in FIG. 7. The receiving unit 1603 may be implemented by the receiving unit in the communications interface 703 in FIG. 7.
An embodiment of this application further provides a video system, including a network device and a terminal. The network device is configured to: obtain a plurality of original video streams, where the plurality of original video streams are obtained based on video streams captured by a plurality of cameras for a same spatial area in a same time period; generate at least one target video stream based on the plurality of original video streams, where the target video stream is a video stream obtained by selecting a certain number of frames of images from the original video stream corresponding to each camera in a set direction; and compress the at least one target video stream. The terminal is configured to: receive a to-be-played video clip from the network device, where the to-be-played video clip is selected from a first target video stream in the compressed at least one target video stream; decompress the to-be-played video clip; and play the decompressed video clip.
The network device here may be the encoder described above. For other functions performed by the encoder, refer to the foregoing descriptions, for example, the embodiment shown in FIG. 8. The terminal may be the terminal described above. For other functions performed by the terminal, refer to the foregoing descriptions, for example, the embodiment shown in FIG. 11.
In an implementation, the network device may further be configured to perform the steps performed by the distribution-side system in the embodiment shown in FIG. 11.
In another implementation, the video system may further include a distribution-side system, where the distribution-side system is configured to receive the encoding result sent by the encoder and distribute the to-be-played video clip to the terminal. For a specific implementation, refer to the foregoing descriptions, for example, the embodiment shown in FIG. 11.
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is run on a computer, the computer is enabled to perform any of the video encoding methods or video playback methods provided above.
For explanations of related content in any of the video systems and computer-readable storage media provided above and descriptions of their beneficial effects, refer to the corresponding embodiments above. Details are not repeated here.
An embodiment of this application further provides a chip. The chip integrates a control circuit and one or more ports that are configured to implement the functions of the video encoding apparatus 140, the video playback apparatus 150, or the video playback apparatus 160. Optionally, for the functions supported by the chip, refer to the foregoing descriptions; details are not repeated here. A person of ordinary skill in the art may understand that all or some of the steps of the foregoing embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a random access memory, or the like. The processing unit or processor may be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
An embodiment of this application further provides a computer program product including instructions. When the instructions are run on a computer, the computer is enabled to perform any one of the methods in the foregoing embodiments. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or some of the procedures or functions according to the embodiments of this application are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to the computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, an SSD), or the like.
It should be noted that the devices provided in the embodiments of this application for storing computer instructions or computer programs, such as, but not limited to, the foregoing memory, computer-readable storage medium, and communication chip, are all non-transitory.
In the process of implementing this application as claimed, a person skilled in the art can understand and implement other variations of the disclosed embodiments by studying the accompanying drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or another unit may implement several functions enumerated in the claims. The mere fact that some measures are recited in mutually different dependent claims does not indicate that these measures cannot be combined to produce a good effect. Although this application is described with reference to specific features and the embodiments thereof, various modifications and combinations may be made to them without departing from the spirit and scope of this application. Correspondingly, the specification and the accompanying drawings are merely example descriptions of this application defined by the appended claims, and are considered to cover any and all modifications, variations, combinations, or equivalents within the scope of this application.

Claims (29)

  1. A video encoding method, wherein the method comprises:
    obtaining a plurality of original video streams, wherein the plurality of original video streams are obtained based on video streams captured by a plurality of cameras for a same spatial area in a same time period;
    generating at least one target video stream based on the plurality of original video streams, wherein the target video stream is a video stream obtained by selecting a certain number of frames of images, in a set direction, from the original video stream corresponding to each camera; and
    compressing the at least one target video stream.
  2. The method according to claim 1, wherein timestamps corresponding to images in the target video stream are consecutive.
  3. The method according to claim 1 or 2, wherein the plurality of original video streams comprise a first original video stream, and the target video stream starts from a first frame of image in the first original video stream.
  4. The method according to any one of claims 1 to 3, wherein the method further comprises:
    generating and sending an index of the at least one target video stream, wherein the index of the target video stream comprises an identifier of a camera corresponding to the target video stream and a category of the target video stream, the camera corresponding to the target video stream is the camera corresponding to a first frame of image in the target video stream, and the category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream.
  5. The method according to any one of claims 1 to 4, wherein the set direction is a surround direction of the plurality of cameras, and the surround direction comprises a clockwise direction or a counterclockwise direction.
  6. The method according to any one of claims 1 to 5, wherein the method further comprises:
    encapsulating the compressed at least one target video stream to obtain a plurality of segments; and
    generating and sending indexes of the plurality of segments.
  7. A video playback method, wherein the method comprises:
    receiving a to-be-played video clip, wherein the to-be-played video clip is selected from a first target video stream, the first target video stream is a video stream obtained by selecting a certain number of frames of images, in a set direction, from an original video stream corresponding to each of a plurality of cameras, and a plurality of original video streams corresponding to the plurality of cameras are obtained based on video streams captured by the plurality of cameras for a same spatial area in a same time period; and
    decompressing the to-be-played video clip, and playing the decompressed video clip.
  8. The method according to claim 7, wherein timestamps corresponding to images in the first target video stream are consecutive.
  9. The method according to claim 7 or 8, wherein the plurality of original video streams comprise a first original video stream, and the first target video stream starts from a first frame of image in the first original video stream.
  10. The method according to any one of claims 7 to 9, wherein the method further comprises:
    sending a request message, wherein the request message comprises an index of the first target video stream, and the index of the first target video stream is used to determine the first target video stream, wherein the index of the first target video stream comprises an identifier of a camera corresponding to the first target video stream and a category of the first target video stream, the camera corresponding to the first target video stream is the camera corresponding to a first frame of image in the first target video stream, and the category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream.
  11. The method according to claim 10, wherein the request message further comprises an index of a target segment to which the to-be-played video clip belongs, and the receiving a to-be-played video clip comprises:
    receiving the target segment.
  12. The method according to any one of claims 7 to 11, wherein the method further comprises:
    determining a surround direction of the to-be-played video clip and a timestamp corresponding to a start image of the to-be-played video clip; and
    determining the index of the first target video stream based on the surround direction of the to-be-played video clip and the timestamp corresponding to the start image of the to-be-played video clip.
  13. The method according to claim 12, wherein:
    the surround direction corresponding to the first target video stream is the same as the surround direction of the to-be-played video clip; and
    the first target video stream comprises a frame of image immediately preceding a first image in a currently played video stream, and a timestamp corresponding to the first image is the same as the timestamp corresponding to the start image.
  14. A video encoding apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a plurality of original video streams, wherein the plurality of original video streams are obtained based on video streams captured by a plurality of cameras for a same spatial area in a same time period;
    a generation unit, configured to generate at least one target video stream based on the plurality of original video streams, wherein the target video stream is a video stream obtained by selecting a certain number of frames of images, in a set direction, from the original video stream corresponding to each camera; and
    a compression unit, configured to compress the at least one target video stream.
  15. The apparatus according to claim 14, wherein timestamps corresponding to images in the target video stream are consecutive.
  16. The apparatus according to claim 14 or 15, wherein the plurality of original video streams comprise a first original video stream, and the target video stream starts from a first frame of image in the first original video stream.
  17. The apparatus according to any one of claims 14 to 16, wherein:
    the generation unit is further configured to generate an index of the at least one target video stream, wherein the index of the target video stream comprises an identifier of a camera corresponding to the target video stream and a category of the target video stream, the camera corresponding to the target video stream is the camera corresponding to a first frame of image in the target video stream, and the category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream; and
    the apparatus further comprises a sending unit, configured to send the index of the at least one target video stream.
  18. The apparatus according to any one of claims 14 to 17, wherein the set direction is a surround direction of the plurality of cameras, and the surround direction comprises a clockwise direction or a counterclockwise direction.
  19. The apparatus according to any one of claims 14 to 18, wherein the apparatus further comprises an encapsulation unit and a sending unit, wherein:
    the encapsulation unit is configured to encapsulate the compressed at least one target video stream to obtain a plurality of segments;
    the generation unit is further configured to generate indexes of the plurality of segments; and
    the sending unit is configured to send the indexes of the plurality of segments.
  20. A video playback apparatus, wherein the apparatus comprises:
    a receiving unit, configured to receive a to-be-played video clip, wherein the to-be-played video clip is selected from a first target video stream, the first target video stream is a video stream obtained by selecting a certain number of frames of images, in a set direction, from an original video stream corresponding to each of a plurality of cameras, and a plurality of original video streams corresponding to the plurality of cameras are obtained based on video streams captured by the plurality of cameras for a same spatial area in a same time period;
    a decompression unit, configured to decompress the to-be-played video clip; and
    a playback unit, configured to play the decompressed video clip.
  21. The apparatus according to claim 20, wherein timestamps corresponding to images in the first target video stream are consecutive.
  22. The apparatus according to claim 20 or 21, wherein the plurality of original video streams comprise a first original video stream, and the first target video stream starts from a first frame of image in the first original video stream.
  23. The apparatus according to any one of claims 20 to 22, wherein the apparatus further comprises:
    a sending unit, configured to send a request message, wherein the request message comprises an index of the first target video stream, and the index of the first target video stream is used to determine the first target video stream, wherein the index of the first target video stream comprises an identifier of a camera corresponding to the first target video stream and a category of the first target video stream, the camera corresponding to the first target video stream is the camera corresponding to a first frame of image in the first target video stream, and the category of a video stream is used to indicate whether the video stream is an original video stream or a target video stream.
  24. The apparatus according to claim 23, wherein the request message further comprises an index of a target segment to which the to-be-played video clip belongs; and
    the receiving unit is specifically configured to receive the target segment.
  25. The apparatus according to any one of claims 20 to 24, wherein the apparatus further comprises:
    a determining unit, configured to: determine a surround direction of the to-be-played video clip and a timestamp corresponding to a start image of the to-be-played video clip; and determine the index of the first target video stream based on the surround direction of the to-be-played video clip and the timestamp corresponding to the start image of the to-be-played video clip.
  26. The apparatus according to claim 25, wherein:
    the surround direction corresponding to the first target video stream is the same as the surround direction of the to-be-played video clip; and
    the first target video stream comprises a frame of image immediately preceding a first image in a currently played video stream, and a timestamp corresponding to the first image is the same as the timestamp corresponding to the start image.
  27. A video system, comprising a network device and a terminal, wherein:
    the network device is configured to: obtain a plurality of original video streams, wherein the plurality of original video streams are obtained based on video streams captured by a plurality of cameras for a same spatial area in a same time period; generate at least one target video stream based on the plurality of original video streams, wherein the target video stream is a video stream obtained by selecting a certain number of frames of images, in a set direction, from the original video stream corresponding to each camera; and compress the at least one target video stream; and
    the terminal is configured to: receive a to-be-played video clip from the network device, wherein the to-be-played video clip is selected from a first target video stream in the compressed at least one target video stream; decompress the to-be-played video clip; and play the decompressed video clip.
  28. A video encoding apparatus, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any one of claims 1 to 6.
  29. A video playback apparatus, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any one of claims 7 to 13.
PCT/CN2021/130745 2020-11-16 2021-11-15 Video encoding and video playback method, apparatus and system WO2022100742A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011282060.9 2020-11-16
CN202011282060.9A CN114513669A (en) 2020-11-16 2020-11-16 Video coding and video playing method, device and system

Publications (1)

Publication Number Publication Date
WO2022100742A1 (en) 2022-05-19

Family

ID=81546846

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130745 WO2022100742A1 (en) 2020-11-16 2021-11-15 Video encoding and video playback method, apparatus and system

Country Status (2)

Country Link
CN (1) CN114513669A (en)
WO (1) WO2022100742A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180007339A1 (en) * 2016-06-30 2018-01-04 Sony Interactive Entertainment Inc. Apparatus and method for capturing and displaying segmented content
CN109996110A (en) * 2017-12-29 2019-07-09 中兴通讯股份有限公司 A kind of video broadcasting method, terminal, server and storage medium
CN110719425A (en) * 2018-07-11 2020-01-21 视联动力信息技术股份有限公司 Video data playing method and device
CN111355966A (en) * 2020-03-05 2020-06-30 上海乐杉信息技术有限公司 Surrounding free visual angle live broadcast method and system
WO2021218573A1 (en) * 2020-04-29 2021-11-04 华为技术有限公司 Video playing method, apparatus and system, and computer storage medium

Also Published As

Publication number Publication date
CN114513669A (en) 2022-05-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21891262; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21891262; Country of ref document: EP; Kind code of ref document: A1)