WO2021238653A1 - Broadcast directing method, device and system - Google Patents

Broadcast directing method, device and system

Info

Publication number
WO2021238653A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame interval
video stream
frame
event
sequence
Prior art date
Application number
PCT/CN2021/093223
Other languages
English (en)
French (fr)
Inventor
陈越
左佳伟
王林芳
姚霆
梅涛
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司 and 北京京东世纪贸易有限公司
Priority to US17/999,984 (US20230209141A1)
Priority to JP2022573344A (JP2023527218A)
Priority to EP21812681.1A (EP4145834A4)
Publication of WO2021238653A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/21805Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream

Definitions

  • The present disclosure relates to the field of computer technology, and in particular to a broadcast directing method, device and system, and a computer-storable medium.
  • the video streams of each camera are directly transmitted to the video switcher.
  • the director video that meets the live broadcast delay requirements is synthesized according to the instructions of the on-site director.
  • the on-site director needs to combine the live broadcast situation and select the appropriate camera video stream for output.
  • some live broadcast scenes also need to select suitable clips from multiple video streams for playback.
  • a complete live broadcast director team includes cameramen, editors and live directors.
  • Cameramen are distributed in multiple locations on the live broadcast site, and cameras of different formats are used to provide different forms of live images.
  • the work of the cameraman has a certain degree of autonomy, that is, to shoot the live broadcast scene autonomously according to certain principles.
  • the cameraman is also under the control of the on-site choreographer's instructions.
  • The editor is located in the directing vehicle and is responsible for selecting valuable clips from the multiple video streams coming from the cameramen for playback; most of the time, one person is required to edit multiple videos.
  • The on-site director is located in the directing vehicle, watches the multiple real-time video streams and the playback clips provided by the editor, and selects suitable materials from among them to generate the director video. The on-site director also needs to direct the cameramen and editors to obtain effective original video materials and highlight clips.
  • That is, on-site directors manually select appropriate video materials in a short time according to the obtained video streams, and synthesize the director videos.
  • A broadcast directing method is provided, including: acquiring a reference video stream from a reference camera position; performing event recognition on the reference video stream to obtain at least one reference event frame interval, where each reference event frame interval corresponds to a unique event and includes multiple consecutive frame identifiers of images in which the same event occurs; determining, according to the correspondence between events and camera position identifiers, the partial sequence of each reference event frame interval, where the partial sequence includes the camera position identifier of each frame of the video to be played corresponding to the reference event frame interval and the frame identifier corresponding to that camera position identifier; generating a guide sequence according to each partial sequence; and generating a guide video according to the guide sequence and the video streams of the corresponding camera positions.
  • The at least one reference event frame interval includes the i-th reference event frame interval, where i is a positive integer, and determining the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers includes: determining the initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, where the start frame identifier and the end frame identifier of the initial partial sequence are respectively the start frame identifier and the end frame identifier of the i-th reference event frame interval; acquiring a video stream from at least one first auxiliary camera position; and extending the initial partial sequence of the i-th reference event frame interval by using the video stream from the at least one first auxiliary camera position to obtain the partial sequence of the i-th reference event frame interval.
  • the at least one reference event frame interval further includes an i+1-th reference event frame interval
  • The start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i, respectively, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}.
  • Extending the initial partial sequence of the i-th reference event frame interval includes: for the case where i is equal to 1, when at least one of the following holds: s_i and 1 are not adjacent, or e_i and s_{i+1} are not adjacent, acquiring, from at least one first auxiliary camera position, at least one of the video stream between 1 and s_i and the video stream between e_i and s_{i+1} as an extended video stream; and extending the initial partial sequence of the i-th reference event frame interval by using the extended video stream to obtain the partial sequence of the i-th reference event frame interval.
  • The at least one reference event frame interval further includes the (i-1)-th reference event frame interval, and the end frame identifier of the partial sequence of the (i-1)-th reference event frame interval is E_{i-1}. Extending the initial partial sequence of the i-th reference event frame interval includes: for the case where i is greater than 1, when at least one of the following holds: s_i and E_{i-1} are not adjacent, or e_i and s_{i+1} are not adjacent, acquiring, from at least one first auxiliary camera position, at least one of the video stream between E_{i-1} and s_i and the video stream between e_i and s_{i+1} as an extended video stream; and extending the initial partial sequence of the i-th reference event frame interval by using the extended video stream to obtain the partial sequence of the i-th reference event frame interval.
  • the extended video stream is multi-channel, and the multi-channel extended video stream comes from a plurality of first auxiliary camera positions.
  • Extending the initial partial sequence of the i-th reference event frame interval includes: performing face recognition on each extended video stream to obtain at least one face frame interval corresponding to the extended video stream, where each face frame interval corresponds to a unique face recognition result and includes multiple consecutive frame identifiers of images with the same face recognition result; generating at least one extended frame interval according to each face frame interval of each extended video stream, where each extended frame interval includes at least parts of a plurality of face frame intervals that can be connected in series and correspond to different first auxiliary camera positions; obtaining an extended sequence according to the extended frame interval, among the at least one extended frame interval, that corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames, where the extended sequence includes the camera position identifier of each frame of the video to be played corresponding to the extended frame interval and the frame identifier corresponding to that camera position identifier; and extending the initial partial sequence of the i-th reference event frame interval according to the extended sequence to obtain the partial sequence of the i-th reference event frame interval.
  • Generating at least one extended frame interval according to each face frame interval of each extended video stream includes: for each extended video stream of a first auxiliary camera position, determining the face frame interval adjacent to the i-th reference event frame interval as an initial extended frame interval; starting from the face frame interval adjacent to the i-th reference event frame interval, and proceeding in the direction of decreasing or increasing frame identifiers, concatenating to the initial extended frame interval at least a part of a face frame interval, of a first auxiliary camera position other than the one corresponding to the initial extended frame interval, that can be connected in series with the initial extended frame interval, so as to update the initial extended frame interval; cyclically updating the initial extended frame interval until there is no longer any face frame interval, of a first auxiliary camera position other than the one corresponding to the initial extended frame interval, that can be connected in series with the initial extended frame interval; and determining the updated initial extended frame interval as the extended frame interval.
  • The at least one reference event frame interval includes the i-th reference event frame interval and the (i+1)-th reference event frame interval, where i is an integer greater than or equal to 1; the start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i, respectively, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}.
  • Determining the partial sequence of each reference event frame interval includes: determining the initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, where the start frame identifier and the end frame identifier of the initial partial sequence are s_i and e_i, respectively; in the case that e_i and s_{i+1} are not adjacent, determining a playback type according to the event corresponding to the i-th reference event frame interval; acquiring at least one playback video stream corresponding to the playback type; and extending the initial partial sequence according to the at least one playback video stream to obtain the partial sequence of the i-th reference event frame interval.
  • Extending the initial partial sequence includes: generating at least one playback sequence according to the at least one playback video stream, where each playback sequence includes the camera position identifier of each frame of image located between e_i and s_{i+1} and the frame identifier corresponding to that camera position identifier; and extending the initial partial sequence by using the at least one playback sequence.
  • the playback type includes a first playback type
  • Generating a playback sequence according to the playback video stream includes: in the case where the playback type is the first playback type, performing event recognition on the at least one playback video stream to obtain at least one auxiliary event frame interval, where the auxiliary event frame interval includes multiple consecutive frame identifiers of images in which the event corresponding to the i-th reference event frame interval occurs; and generating at least one playback sequence according to the at least one auxiliary event frame interval.
  • Generating the at least one playback sequence according to the at least one auxiliary event frame interval includes: sorting the at least one auxiliary event frame interval according to the total number of frames and the weight of each auxiliary event frame interval; and generating at least one playback sequence according to the sorting result.
  • the playback type includes a first playback type
  • Acquiring at least one playback video stream corresponding to the playback type includes: in the case where the playback type is the first playback type, acquiring, from at least one first auxiliary camera position, the video stream between s_i - m and e_i + n as the playback video stream, where m and n are both integers greater than or equal to 0.
  • the playback type includes a second playback type
  • Acquiring a playback video stream corresponding to the playback type includes: in the case where the playback type is the second playback type, acquiring, according to the reference video stream, the camera position angle corresponding to each frame of image located between s_i and e_i; determining, according to the respective camera position angles, the area where the event corresponding to the i-th reference event frame interval occurs; and acquiring the video stream between s_i and e_i of at least one second auxiliary camera position located in the area as the playback video stream.
  • The at least one reference event frame interval includes the i-th reference event frame interval and the (i+1)-th reference event frame interval, where i is an integer greater than or equal to 1, and generating the guide sequence includes: in the case that the end frame identifier E_i of the partial sequence of the i-th reference event frame interval is not adjacent to the start frame identifier S_{i+1} of the partial sequence of the (i+1)-th reference event frame interval, generating a supplementary sequence, where the supplementary sequence includes the camera position identifier and the frame identifier of each frame of image located between E_i and S_{i+1}, and the camera position of each frame of image located between E_i and S_{i+1} is a third auxiliary camera position; and merging each partial sequence and the supplementary sequence to obtain the guide sequence.
  • the reference camera position is used to provide close-up video streams of the ball players
  • the first auxiliary camera position is used to provide close-up video streams of different angles on the court
  • The second auxiliary camera position is used to provide standard video streams of different angles on the court.
  • The third auxiliary camera position is used to provide a standard video stream of the viewer's perspective.
  • generating the guide video includes: obtaining each frame image corresponding to the guide sequence according to the guide sequence and the video stream of the corresponding camera position; and encoding each frame image to obtain the guide video.
  • A broadcast directing device is provided, including: an acquisition module configured to acquire a reference video stream from a reference camera position; an event recognition module configured to perform event recognition on the reference video stream to obtain at least one reference event frame interval, where each reference event frame interval corresponds to a unique event and includes multiple consecutive frame identifiers of images in which the same event occurs; a determining module configured to determine the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers, where the partial sequence includes the camera position identifier of each frame of the video to be played corresponding to the reference event frame interval and the frame identifier corresponding to that camera position identifier; a first generation module configured to generate a guide sequence according to each partial sequence; and a second generation module configured to generate a guide video according to the guide sequence and the video streams of the corresponding camera positions.
  • A broadcast directing device is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the broadcast directing method described in any of the foregoing embodiments.
  • A broadcast directing system is provided, including: the broadcast directing device according to any one of the above embodiments; and at least one camera configured to generate a video stream and send the video stream to the broadcast directing device.
  • a computer storable medium having computer program instructions stored thereon, and when the instructions are executed by a processor, the directing method described in any of the foregoing embodiments is implemented.
  • FIG. 1 is a flowchart showing a method for directing broadcast according to some embodiments of the present disclosure
  • FIG. 2 is a diagram showing the distribution of live broadcast cameras according to some embodiments of the present disclosure
  • FIG. 3 is a schematic diagram showing a merged reference event frame interval according to some embodiments of the present disclosure
  • FIG. 4 is a flowchart illustrating a partial sequence of determining each reference event frame interval according to some embodiments of the present disclosure
  • Fig. 5a is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to some embodiments of the present disclosure
  • Fig. 5b is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to other embodiments of the present disclosure
  • Fig. 6a is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to some embodiments of the present disclosure
  • Fig. 7 is a flowchart showing a partial sequence of determining each reference event frame interval according to other embodiments of the present disclosure.
  • FIG. 8 is a block diagram showing a broadcasting director device according to some embodiments of the present disclosure.
  • FIG. 9 is a block diagram showing a broadcasting director device according to other embodiments of the present disclosure.
  • FIG. 10 is a block diagram showing a directing system according to some embodiments of the present disclosure.
  • Figure 11 is a block diagram illustrating a computer system for implementing some embodiments of the present disclosure.
  • the present disclosure proposes a directing method, which can reduce labor costs and improve the real-time performance and accuracy of directing.
  • Fig. 1 is a flowchart illustrating a method for directing broadcast according to some embodiments of the present disclosure.
  • Fig. 2 is a diagram showing the distribution of live broadcast camera positions according to some embodiments of the present disclosure.
  • the guide method includes: step S10, acquiring a reference video stream from a reference camera; step S20, performing event recognition on the reference video stream to obtain at least one reference event frame interval; step S30, determining each reference event A partial sequence of the frame interval; step S40, generate a guide sequence according to each partial sequence; step S50, generate a guide video according to the guide sequence and the video stream of the corresponding camera.
  • the directing method is executed by the directing device.
  • the present disclosure obtains the partial sequence of each reference event frame interval through event recognition, and conducts guidance according to each partial sequence, realizes automatic guidance, reduces labor costs, and improves the real-time and accuracy of guidance.
  • the difficulty of the on-site director team is greatly reduced.
  • the on-site director only needs to direct the cameraman to shoot appropriate video materials.
  • the generation and output of the director video is efficiently and automatically completed by the computer.
  • The computer code is customizable, which facilitates modification and customization of the directing logic, and makes it possible to output director videos personalized for different viewers, which greatly enriches the audience's choices.
  • step S10 a reference video stream from a reference camera is obtained.
  • the reference video stream from the reference camera is obtained through the input interface.
  • the reference camera position is the camera CAM-2 as shown in Figure 2.
  • the camera CAM-2 is a 4K camera that provides a close-up video stream of the ball player.
  • The camera CAM-2 is equipped with a lens of 100× or higher magnification and serves as a close-up camera position in the grandstand.
  • the camera position of the camera CAM-2 is identified as 2.
  • the cameras in Figure 2 are located on the court.
  • step S20 event recognition is performed on the reference video stream to obtain at least one reference event frame interval.
  • Each reference event frame interval corresponds to a unique event.
  • Each reference event frame interval includes multiple consecutive frame identifiers of images where the same event occurs.
  • video event recognition algorithms are used for event recognition.
  • event recognition algorithms include but are not limited to P3D ResNet (Pseudo-3D Residual Networks, pseudo three-dimensional residual network) algorithm.
  • the event recognition of the reference video stream is realized in the following way.
  • the video event recognition algorithm is used to obtain the event recognition result of each frame of image in the reference video stream.
  • p_cls represents the probability that an event identified as cls occurs, or the probability that no event occurs. For example, in a live football scene, the value of cls is an integer greater than or equal to 1 and less than or equal to 7, representing six different events and the no-event case, respectively.
  • events in the live football scene include, but are not limited to, shots, free kicks, corner kicks, goal kicks, boundary kicks, and player conflicts.
  • a smoothing operation is performed on the event recognition result of each frame of image in the reference video stream, and the smoothed event recognition result of each frame of image is obtained.
  • a time window with a length of t seconds is used to perform a smoothing operation on the event recognition result of each frame of image with a step length of 1 frame.
  • t is equal to 0.5.
  • f is the frame rate of the reference video stream.
  • the event or no event corresponding to the maximum probability in the smoothed event recognition result is determined as the final event recognition result of each frame of image.
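  • For illustration only (not part of the original disclosure), the smoothing described above can be sketched in Python as follows; the use of a window centred on each frame and of NumPy probability arrays are assumptions:

    import numpy as np

    def smooth_event_recognition(probs, frame_rate, t=0.5):
        """probs: array of shape (num_frames, num_classes) with per-frame event
        probabilities p_cls (one class meaning "no event").  Returns the final
        event recognition result of each frame after smoothing with a window of
        f x t frames and a step length of 1 frame."""
        window = max(1, int(round(frame_rate * t)))
        num_frames = probs.shape[0]
        labels = np.empty(num_frames, dtype=int)
        for j in range(num_frames):
            lo = max(0, j - window // 2)
            hi = min(num_frames, lo + window)
            smoothed = probs[lo:hi].mean(axis=0)   # average the probabilities in the window
            labels[j] = int(smoothed.argmax())     # event (or no event) with the maximum probability
        return labels
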
  • the frame identifiers of multiple consecutive images where the same event occurs are merged to obtain at least one reference event frame interval.
  • Multiple reference event frame intervals that correspond to the same event and are separated only by a number of frames of no-event images may also be merged into one reference event frame interval.
  • FIG. 3 is a schematic diagram illustrating a merged reference event frame interval according to some embodiments of the present disclosure.
  • a, b, and c respectively represent the reference event frame intervals of different events.
  • In the reference video stream, there are two reference event frame intervals c; since the number of frames separating them is less than a preset threshold, they are merged into one reference event frame interval.
  • For example, the preset threshold is f × t_1, and t_1 is 0.5 seconds.
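  • A minimal sketch of the interval construction and merging described above is given below for illustration; the 1-based frame identifiers, the label value 0 for "no event" and the use of <= for the threshold comparison are assumptions:

    def build_reference_event_frame_intervals(labels, frame_rate, t1=0.5):
        """labels: final per-frame event labels (0 = no event).  Returns a list of
        (event_label, start_frame_id, end_frame_id) reference event frame intervals."""
        threshold = frame_rate * t1
        intervals = []
        # 1. Group consecutive frame identifiers of images in which the same event occurs.
        for j, cls in enumerate(labels, start=1):
            if cls == 0:
                continue
            if intervals and intervals[-1][0] == cls and j - intervals[-1][2] == 1:
                intervals[-1] = (cls, intervals[-1][1], j)
            else:
                intervals.append((cls, j, j))
        # 2. Merge intervals of the same event separated by fewer than f x t1 no-event frames.
        merged = []
        for itv in intervals:
            if merged and merged[-1][0] == itv[0] and itv[1] - merged[-1][2] <= threshold:
                merged[-1] = (itv[0], merged[-1][1], itv[2])
            else:
                merged.append(itv)
        return merged
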
  • step S30 is executed.
  • step S30 the partial sequence of each reference event frame interval is determined according to the corresponding relationship between the event and the machine position identifier.
  • the partial sequence includes the camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and the frame identifier corresponding to the camera position identifier.
  • At least one reference event frame interval includes the i-th reference event frame interval.
  • i is a positive integer.
  • The partial sequence of the i-th reference event frame interval can be expressed as {c_j(k, z), S_i ≤ j ≤ E_i}, where S_i and E_i are respectively the start frame identifier and the end frame identifier of the partial sequence of the i-th reference event frame interval.
  • j is the frame identifier of the video to be played, and j is greater than or equal to S_i and less than or equal to E_i.
  • k is the camera position identifier
  • z is the frame identifier of the video stream corresponding to the camera position identifier.
  • c j (k, z) indicates that the j-th frame image of the to-be-played video corresponding to the i-th reference event frame interval is the z-th frame image from the camera position with the camera position identifier k.
  • Table 1 shows the correspondence between events in a live football scene and camera location identifiers.
  • the camera position with the camera position identification 1 is the camera CAM-1 as shown in FIG. 2.
  • the camera CAM-1 is a 4K camera used to provide a standard video stream of the viewer's perspective.
  • the camera CAM-1 provides a standard lens, which is a panoramic camera position for the grandstand.
  • step S30 is implemented through the steps shown in FIG. 4.
  • FIG. 4 is a flowchart illustrating a partial sequence of determining each reference event frame interval according to some embodiments of the present disclosure.
  • determining the partial sequence of each reference event frame interval includes step S31-step S33.
  • step S31 the initial partial sequence of the i-th reference event frame interval is determined according to the corresponding relationship between the event and the machine location identifier.
  • the start frame identifier and the end frame identifier of the initial partial sequence are respectively the start frame identifier and the end frame identifier of the i-th reference event frame interval.
  • the event in the i-th reference event frame interval is a corner kick.
  • The camera position identifier corresponding to a corner kick is 2.
  • The start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i, respectively.
  • The initial partial sequence of the i-th reference event frame interval is expressed as {c_j(2, j), s_i ≤ j ≤ e_i}, where c_j(2, j) indicates that the j-th frame image of the to-be-played video corresponding to the initial partial sequence is the j-th frame image from the video stream of the camera position with camera position identifier 2.
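  • The following illustrative sketch (an assumed data layout, not the patent's notation) shows one way to represent sequence entries c_j(k, z) and to build the initial partial sequence:

    from dataclasses import dataclass

    @dataclass
    class SequenceEntry:
        j: int  # frame identifier of the video to be played
        k: int  # camera position identifier
        z: int  # frame identifier in the video stream of camera position k

    def initial_partial_sequence(camera_id, s_i, e_i):
        """Initial partial sequence of a reference event frame interval [s_i, e_i]:
        frame j of the video to be played is frame j of the mapped camera position."""
        return [SequenceEntry(j=j, k=camera_id, z=j) for j in range(s_i, e_i + 1)]

    # For example, for a corner kick mapped to camera position identifier 2:
    # seq = initial_partial_sequence(2, s_i=1200, e_i=1450)   # hypothetical frame identifiers
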
  • step S32 a video stream from at least one first auxiliary camera position is acquired.
  • the first auxiliary camera position is used to provide close-up video streams from different angles on the court
  • The first auxiliary camera positions are the cameras CAM-3, CAM-7, CAM-8 and CAM-10 with the camera position identifiers 3, 7, 8 and 10 shown in Figure 2.
  • Camera CAM-3, camera CAM-7, camera CAM-8 and camera CAM-10 are all 4K cameras, equipped with lenses of 80× or higher, 40× or higher, 80× or higher and 80× or higher magnification, respectively.
  • Camera CAM-3, camera CAM-7 and camera CAM-10 are ground camera positions, and camera CAM-8 is a grandstand camera position.
  • step S33 the initial partial sequence of the i-th reference event frame interval is expanded by using the video stream from the at least one first auxiliary camera position to obtain the partial sequence of the i-th reference event frame interval.
  • Fig. 5a is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to some embodiments of the present disclosure.
  • The at least one reference event frame interval further includes the (i+1)-th reference event frame interval, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}.
  • the initial partial sequence for extending the i-th reference event frame interval includes: step S331-step S332.
  • In step S331, for the case where i is equal to 1, if at least one of the following holds: s_i and 1 are not adjacent, or e_i and s_{i+1} are not adjacent, at least one of the video stream between 1 and s_i and the video stream between e_i and s_{i+1} is acquired from at least one first auxiliary camera position as an extended video stream.
  • Here, non-adjacent means that the difference between s_i and 1, or the difference between s_{i+1} and e_i, is greater than a preset difference.
  • For example, the preset difference is 0 or f × t_2, and t_2 is 2 seconds.
  • step S332 the initial partial sequence of the i-th reference event frame interval is expanded by using the extended video stream to obtain the partial sequence of the i-th reference event frame interval.
  • Fig. 5b is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to other embodiments of the present disclosure.
  • The at least one reference event frame interval further includes the (i-1)-th reference event frame interval, and the end frame identifier of the partial sequence of the (i-1)-th reference event frame interval is E_{i-1}.
  • the initial partial sequence for extending the i-th reference event frame interval includes: step S331'-step S332'.
  • In step S331', for the case where i is greater than 1, if at least one of the following holds: s_i and E_{i-1} are not adjacent, or e_i and s_{i+1} are not adjacent, at least one of the video stream between E_{i-1} and s_i and the video stream between e_i and s_{i+1} is acquired from at least one first auxiliary camera position as an extended video stream.
  • Here, non-adjacent means that the difference between s_i and E_{i-1}, or the difference between s_{i+1} and e_i, is greater than a preset difference.
  • For example, the preset difference is 0 or f × t_2, and t_2 is 2 seconds.
  • In some embodiments, the video stream between e_i and the sum of e_i and a preset value is acquired from at least one first auxiliary camera position as an extended video stream.
  • step S332' using the extended video stream, the initial partial sequence of the i-th reference event frame interval is expanded to obtain the partial sequence of the i-th reference event frame interval.
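  • As an illustrative sketch (assumptions: 1-based frame identifiers, a single preset difference of f x t2 frames, and prev_end equal to 1 when i equals 1 or to E_{i-1} when i is greater than 1), the frame ranges for which extended video streams are requested can be determined as follows:

    def extension_ranges(s_i, e_i, prev_end, next_start, frame_rate, t2=2.0):
        """Returns the (start, end) frame ranges to request from the first auxiliary
        camera positions when the reference event frame interval is not adjacent to
        its neighbours."""
        preset_diff = frame_rate * t2   # the preset difference may also simply be 0
        ranges = []
        if s_i - prev_end > preset_diff:     # gap before the reference event frame interval
            ranges.append((prev_end, s_i))
        if next_start - e_i > preset_diff:   # gap after the reference event frame interval
            ranges.append((e_i, next_start))
        return ranges
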
  • Fig. 6a is a flowchart showing an initial partial sequence of extending the i-th reference event frame interval according to some embodiments of the present disclosure.
  • Fig. 6b is a schematic diagram illustrating generating at least one extended frame interval according to some embodiments of the present disclosure.
  • There are multiple extended video streams, and the multiple extended video streams come from the same frame interval of a plurality of first auxiliary camera positions.
  • the initial partial sequence for extending the i-th reference event frame interval includes steps S3321-step S3324.
  • step S3321 face recognition is performed on each extended video stream to obtain at least one face frame interval corresponding to the extended video stream.
  • Each face frame interval corresponds to a unique face recognition result.
  • Each face frame interval includes multiple consecutive frame identifiers of images with the same face recognition result.
  • the total number of frames in each face frame interval is greater than the preset total number of frames.
  • the preset total number of frames is f ⁇ t 2 .
  • t 2 is 2 seconds.
  • For example, the face detection SDK (Software Development Kit) of the Neuhub JD artificial intelligence open platform is used to perform face recognition and obtain the face recognition result of each frame of image of the extended video stream.
  • at least one face frame interval is obtained according to multiple frame identifiers of consecutive multiple frame images with the same face recognition result.
  • the face recognition result of each frame of image is the attributes of the face included in the frame of image. Facial attributes include but are not limited to coaches, substitute athletes and sideline referees.
  • the extended video stream 1 corresponds to the face frame interval 11 and the face frame interval 12.
  • The face frame interval 11 is [x_1, x_2], and the face frame interval 12 is [x_3, s_i - 1].
  • The extended video stream 2 corresponds to the face frame interval 21 and the face frame interval 22.
  • The face frame interval 21 is [x_4, x_5], and the face frame interval 22 is [x_6, s_i - 1].
  • The extended video stream 3 corresponds to the face frame interval 31, and the face frame interval 31 is [x_7, s_i - 1].
  • step S3322 at least one extended frame interval is generated according to each face frame interval of each extended video stream.
  • Each extended frame interval includes at least a part of a plurality of face frame intervals corresponding to different first auxiliary camera positions that can be connected in series.
  • the concatenation here means that two face frame intervals are adjacent or have overlapping parts.
  • the following method is used to generate at least one extended frame interval according to each face frame interval of each extended video stream.
  • the face frame interval adjacent to the i-th reference event frame interval is determined as the initial extended frame interval.
  • the face frame interval 12 is determined as the initial extended frame interval.
  • Starting from the face frame interval adjacent to the i-th reference event frame interval, at least a part of a face frame interval, of a first auxiliary camera position other than the one corresponding to the initial extended frame interval, that can be connected in series with the initial extended frame interval is concatenated to the initial extended frame interval to update the initial extended frame interval.
  • For example, a part [x_7, x_3 - 1] of the face frame interval 31 of another first auxiliary camera position, which can be connected in series with the face frame interval 12, is concatenated to the initial extended frame interval, so that the initial extended frame interval is updated to [x_7, s_i - 1].
  • the initial extended frame interval is updated cyclically until there is no longer a face frame interval of the first auxiliary camera other than the first auxiliary camera corresponding to the initial extended frame interval that can be connected in series with the initial extended frame interval.
  • The initial extended frame interval can be updated cyclically by continuously concatenating face frame intervals.
  • For example, a part [x_4, x_7 - 1] of the face frame interval 21 of another first auxiliary camera position, which can be connected in series with the part [x_7, x_3 - 1], is concatenated to the initial extended frame interval, so that the initial extended frame interval originally [x_3, s_i - 1] is further updated to [x_4, s_i - 1].
  • the updated initial extended frame interval is determined as the extended frame interval.
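  • A minimal sketch of this backward chaining is given below for illustration; the tuple representation of face frame intervals, the rule that "can be connected in series" means adjacent or overlapping, and the choice of the candidate reaching furthest back are assumptions:

    def build_extended_frame_interval(face_intervals, seed):
        """face_intervals: list of (camera_id, start, end) face frame intervals from the
        first auxiliary camera positions; seed: the face frame interval adjacent to the
        i-th reference event frame interval.  Returns the pieces that make up the
        extended frame interval, ordered from the earliest frame up to s_i - 1."""
        pieces = [seed]
        used_cameras = {seed[0]}
        current_start = seed[1]
        while True:
            # face frame intervals of other first auxiliary camera positions that
            # overlap or abut the current front edge of the extended frame interval
            candidates = [(cam, s, e) for (cam, s, e) in face_intervals
                          if cam not in used_cameras and s < current_start <= e + 1]
            if not candidates:
                break
            cam, s, e = max(candidates, key=lambda itv: current_start - itv[1])
            pieces.insert(0, (cam, s, current_start - 1))   # only the non-overlapping part is concatenated
            used_cameras.add(cam)
            current_start = s
        return pieces
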
  • In step S3323, the extended sequence is obtained according to the extended frame interval, among the at least one extended frame interval, that corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames.
  • the extended sequence includes the camera position identifier of each frame of the video to be played corresponding to the extended frame interval and the frame identifier corresponding to the camera position identifier.
  • That is, the extended sequence is obtained according to the extended frame interval whose corresponding first auxiliary camera positions are the most numerous and whose total number of frames is the largest.
  • step S3324 according to the extended sequence, the initial partial sequence of the i-th reference event frame interval is expanded to obtain the partial sequence of the i-th reference event frame interval.
  • The partial sequence of the i-th reference event frame interval is then obtained through the extension.
  • The start frame identifier of the extended sequence and the end frame identifier of the partial sequence of the preceding event frame interval are usually frame identifiers separated by a certain number of frames.
  • In this case, the sequence between the start frame identifier of the extended sequence and the end frame identifier of the partial sequence of the preceding event frame interval is supplemented with the sequence of the corresponding frame images of the third auxiliary camera position.
  • the third auxiliary camera is used to provide a standard video stream from the viewer's perspective.
  • the third auxiliary camera position is the camera CAM-1 of FIG. 2.
  • The camera CAM-1 is a 4K camera equipped with a standard lens and serves as a panoramic camera position in the grandstand.
  • Both non-adjacent situations may exist at the same time: s_i and E_{i-1} are not adjacent and e_i and s_{i+1} are not adjacent, or, for i equal to 1, s_i and 1 are not adjacent and e_i and s_{i+1} are not adjacent.
  • When both non-adjacent situations exist, two extended sequences are obtained at the same time, and the initial partial sequence is extended with both.
  • In some embodiments, step S30 is implemented through the steps shown in FIG. 7 to determine the partial sequence of each reference event frame interval.
  • FIG. 7 is a flowchart illustrating a partial sequence of determining each reference event frame interval according to other embodiments of the present disclosure.
  • determining the partial sequence of each reference event frame interval includes step S31'-step S34'.
  • step S31' the initial partial sequence of the i-th reference event frame interval is determined according to the corresponding relationship between the event and the camera position identifier.
  • The start frame identifier and the end frame identifier of the initial partial sequence are s_i and e_i, respectively.
  • the playback type is determined according to the event corresponding to the i-th reference event frame interval.
  • the playback type includes a first playback type and a second playback type.
  • the first playback type is close-up camera slow playback
  • the second playback type is standard camera normal speed playback.
  • For example, when the event is a player conflict, the playback type is close-up camera slow playback; when the event is a shot, corner kick or free kick, the playback type is standard camera normal speed playback.
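  • Purely as an illustration of this mapping (the event names below are shorthand; events other than those listed in the text are not covered by the example):

    PLAYBACK_TYPE = {
        "player_conflict": "close_up_slow_playback",      # first playback type
        "shot":            "standard_normal_speed",       # second playback type
        "corner_kick":     "standard_normal_speed",
        "free_kick":       "standard_normal_speed",
    }
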
  • step S33' at least one playback video stream corresponding to the playback type is acquired.
  • For example, in the case where the playback type is the first playback type, a video stream between s_i - m and e_i + n from at least one first auxiliary camera position is acquired as the playback video stream.
  • Both m and n are integers greater than or equal to zero.
  • In the case where the playback type is the second playback type, the area where the event corresponding to the i-th reference event frame interval occurs is determined.
  • The video stream between s_i and e_i of at least one second auxiliary camera position located in that area is then acquired as the playback video stream.
  • the value range of the camera position angle is [-90,90], and the unit is degrees.
  • the second auxiliary camera position is used to provide standard video streams from different angles on the court.
  • the second auxiliary camera positions are camera CAM-4, camera CAM-5, camera CAM-6, and camera CAM-9 in FIG. 2.
  • Camera CAM-4, camera CAM-5, camera CAM-6 and camera CAM-9 are marked with 4, 5, 6 and 9 respectively, all of which are 4K cameras and provide standard lenses.
  • Camera CAM-4 and camera CAM-6 are the left-side grandstand offside camera position and the left-side ground camera position, respectively.
  • Camera CAM-5 and camera CAM-9 are the right-side grandstand offside camera position and the right-side ground camera position, respectively.
  • The area where the event occurred is determined in the following way.
  • A one-dimensional linear equation of the camera position angle with respect to the frame position, for example angle = k·x + b, simply describes the change process of the camera position angle of the reference camera position over the i-th event frame interval.
  • In the following two cases, the area where the event occurs lies within one half of the field.
  • When k is positive and b is positive, the camera angle at the beginning of the event points to the right half of the field, and the camera angle gradually shifts further to the right as the event proceeds.
  • When k is negative and b is negative, the camera angle at the beginning of the event points to the left half of the field, and the camera angle gradually shifts further to the left as the event proceeds.
  • Otherwise, the half-field is crossed while the event occurs; an event that crosses the half-field is considered not to pose a threat to the goal and is not replayed.
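  • An illustrative sketch of this decision is given below; fitting the line by least squares, treating positive angles as pointing to the right half of the field, and the handling of boundary values are assumptions:

    import numpy as np

    def event_region(angles):
        """angles: camera position angle of the reference camera position (in degrees,
        within [-90, 90]) for each frame of the i-th reference event frame interval.
        Returns 'right', 'left', or None when the event crosses the half-field."""
        x = np.arange(len(angles), dtype=float)
        k, b = np.polyfit(x, np.asarray(angles, dtype=float), deg=1)  # angle ~= k*x + b
        if k >= 0 and b >= 0:
            return "right"   # starts in the right half and shifts further right
        if k <= 0 and b <= 0:
            return "left"    # starts in the left half and shifts further left
        return None          # crosses the half-field: no threat to the goal, no replay
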
  • step S34' the initial partial sequence is expanded according to at least one playback video stream to obtain the partial sequence of the i-th reference event frame interval.
  • the initial partial sequence is expanded according to at least one playback video stream in the following manner.
  • Each playback sequence includes the camera position identifier of each frame of image located between e_i and s_{i+1} and the frame identifier corresponding to that camera position identifier.
  • The auxiliary event frame interval includes a plurality of consecutive frame identifiers of images in which the event corresponding to the i-th reference event frame interval occurs.
  • At least one auxiliary event frame interval is sorted according to the total number of frames and the weight of each auxiliary event frame interval. Furthermore, according to the sorting result, at least one playback sequence is generated.
  • the playback sequence is subjected to frame interpolation processing at a slow motion rate to generate a slow playback sequence.
  • If the camera corresponding to the playback sequence is a high-speed camera, there is no need to perform frame interpolation.
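  • The following sketch illustrates the ordering of candidate clips and a simple slow-playback expansion; combining frame count and weight by multiplication, and emulating slow playback by frame repetition rather than true frame interpolation, are assumptions:

    def rank_auxiliary_event_intervals(intervals, weights):
        """intervals: list of (camera_id, start, end) auxiliary event frame intervals;
        weights: dict mapping camera_id to a weight.  Longer, higher-weight clips first."""
        return sorted(intervals,
                      key=lambda itv: (itv[2] - itv[1] + 1) * weights.get(itv[0], 1.0),
                      reverse=True)

    def slow_playback_frame_ids(start, end, slowdown=2):
        """Repeat each source frame identifier `slowdown` times to emulate slow playback
        when the corresponding camera is not a high-speed camera."""
        return [z for z in range(start, end + 1) for _ in range(slowdown)]
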
  • When the playback type is the second playback type, after the video stream between s_i and e_i of at least one second auxiliary camera position in the area is acquired as the playback video stream, at least one playback sequence is generated according to the at least one playback video stream.
  • a corresponding playback sequence is generated according to the start frame identifier and the end frame identifier of each playback video stream, and the corresponding camera position identifier.
  • Starting from the end frame identifier of the initial partial sequence, as many playback sequences as possible are concatenated continuously to obtain the partial sequence.
  • step S40 is executed.
  • step S40 a guide sequence is generated according to each partial sequence.
  • In the case that the end frame identifier E_i of the partial sequence of the i-th reference event frame interval is not adjacent to the start frame identifier S_{i+1} of the partial sequence of the (i+1)-th reference event frame interval, a supplementary sequence is generated.
  • The supplementary sequence includes the camera position identifier and the frame identifier of each frame of image located between E_i and S_{i+1}, and the camera position of each frame of image located between E_i and S_{i+1} is the third auxiliary camera position. Furthermore, each partial sequence and the supplementary sequence are merged to obtain the guide sequence.
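  • For illustration, a minimal sketch of the gap filling and merging is given below; the (j, k, z) tuple layout is an assumed representation, and the camera position identifier 1 for the third auxiliary camera position follows the example in the text:

    def supplementary_sequence(prev_end, next_start, third_aux_camera_id=1):
        """Frames strictly between E_i and S_{i+1} are taken from the third auxiliary
        camera position, using the same frame identifiers (z = j)."""
        return [(j, third_aux_camera_id, j) for j in range(prev_end + 1, next_start)]

    def merge_guide_sequence(partial_sequences):
        """partial_sequences: list of partial sequences, each a list of (j, k, z)
        entries ordered by j.  Non-adjacent neighbours are bridged with a
        supplementary sequence before concatenation."""
        guide = []
        for seq in partial_sequences:
            if guide and seq[0][0] - guide[-1][0] > 1:
                guide.extend(supplementary_sequence(guide[-1][0], seq[0][0]))
            guide.extend(seq)
        return guide
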
  • a guide video is generated according to the guide sequence and the video stream of the corresponding camera position.
  • each frame image corresponding to the guide sequence is obtained according to the guide sequence and the video stream of the corresponding camera position. Furthermore, each frame of image is coded to obtain a guide video.
  • the video stream is stored in the buffer.
  • According to the camera position identifier of each frame of image provided by the guide sequence and the frame identifier corresponding to that camera position identifier, the image with the corresponding frame identifier is obtained from the buffered video stream of the corresponding camera position, and each frame of image is encoded in sequence to obtain the guide video.
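  • As an illustrative sketch of this last step (assumptions: the buffered video streams are available as decoded frames indexed by camera position identifier and frame identifier, and OpenCV is used for encoding; the disclosure does not prescribe a particular codec):

    import cv2

    def encode_guide_video(guide_sequence, buffered_frames, out_path, fps, frame_size):
        """guide_sequence: iterable of (j, k, z) entries; buffered_frames[k][z] is the
        decoded image of frame z from the camera position with identifier k."""
        writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, frame_size)
        try:
            for _, k, z in guide_sequence:            # j is implicit in the output order
                writer.write(buffered_frames[k][z])   # look up the frame image and append it
        finally:
            writer.release()
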
  • the guide video is output for live broadcast through the video output interface.
  • FIG. 8 is a block diagram showing a broadcasting directing device according to some embodiments of the present disclosure.
  • the broadcasting guide device 8 includes an acquisition module 81, an event recognition module 82, a determination module 83, a first generation module 84 and a second generation module 85.
  • the obtaining module 81 is configured to obtain a reference video stream from a reference camera position, for example, execute step S10 as shown in FIG. 1.
  • the broadcasting guide device 8 further includes an input interface 80.
  • the obtaining module 81 obtains the reference video stream from the reference camera through the input interface 80.
  • the event recognition module 82 is configured to perform event recognition on the reference video stream to obtain at least one reference event frame interval, for example, perform step S20 as shown in FIG. 1.
  • Each reference event frame interval corresponds to a unique event.
  • Each reference event frame interval includes multiple consecutive frame identifiers of images where the same event occurs.
  • the determining module 83 is configured to determine the partial sequence of each reference event frame interval according to the corresponding relationship between the event and the machine position identifier, for example, execute step S30 as shown in FIG. 1.
  • the partial sequence includes the camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and the frame identifier corresponding to the camera position identifier.
  • the first generating module 84 is configured to generate a guide sequence according to each partial sequence, for example, execute step S40 as shown in FIG. 1.
  • the second generation module 85 is configured to generate a guide video according to the guide sequence and the video stream of the corresponding camera position, for example, perform step S50 as shown in FIG. 1.
  • the broadcasting director 8 further includes a buffer 86.
  • the buffer 86 is configured to store the video stream of the corresponding camera position.
  • the obtaining module 81 may obtain the video stream of the corresponding camera position through the input interface 80 and buffer it in the buffer 86.
  • the broadcasting guide device 8 further includes an output interface 87.
  • the second generation module 85 outputs the guide video through the output interface 87 for live broadcast.
  • FIG. 9 is a block diagram showing a broadcasting director device according to other embodiments of the present disclosure.
  • the broadcasting guide device 9 includes a memory 91; and a processor 92 coupled to the memory 91.
  • the memory 91 is used to store instructions for executing the corresponding embodiment of the broadcast directing method.
  • the processor 92 is configured to execute the directing method in any of the embodiments of the present disclosure based on instructions stored in the memory 91.
  • Figure 10 is a block diagram illustrating a directing system according to some embodiments of the present disclosure.
  • the broadcasting guiding system 10 includes a broadcasting guiding device 101 and at least one camera 102.
  • the broadcasting guiding device 101 is a broadcasting guiding device in any of the embodiments of the present disclosure.
  • the broadcasting guiding device 101 is configured to execute the broadcasting guiding method in any of the embodiments of the present disclosure.
  • At least one camera 102 is configured to generate a video stream and send the video stream to the broadcasting director.
  • a camera corresponds to a camera position and has a unique camera position identification.
  • the video stream includes but is not limited to the reference video stream and the video stream of the corresponding camera position.
  • Figure 11 is a block diagram illustrating a computer system for implementing some embodiments of the present disclosure.
  • the computer system 110 may be expressed in the form of a general-purpose computing device.
  • the computer system 110 includes a memory 1110, a processor 1120, and a bus 1100 connecting different system components.
  • the memory 1110 may include, for example, a system memory, a non-volatile storage medium, and the like.
  • the system memory for example, stores an operating system, an application program, a boot loader (Boot Loader), and other programs.
  • the system memory may include volatile storage media, such as random access memory (RAM) and/or cache memory.
  • the non-volatile storage medium stores, for example, instructions for executing at least one of the corresponding embodiments of the broadcast directing method.
  • Non-volatile storage media include, but are not limited to, magnetic disk storage, optical storage, flash memory, and the like.
  • the processor 1120 can be implemented by means of discrete hardware components such as general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, and discrete gates or transistors.
  • accordingly, each module, such as the judgment module and the determination module, can be implemented by a central processing unit (CPU) running the instructions, stored in the memory, that execute the corresponding steps, or by a dedicated circuit that executes the corresponding steps.
  • the bus 1100 can use any bus structure among a variety of bus structures.
  • the bus structure includes, but is not limited to, an industry standard architecture (ISA) bus, a microchannel architecture (MCA) bus, and a peripheral component interconnect (PCI) bus.
  • the computer system 110 may also include an input/output interface 1130, a network interface 1140, a storage interface 1150, and so on. These interfaces 1130, 1140, 1150, the memory 1110, and the processor 1120 may be connected through the bus 1100.
  • the input and output interface 1130 can provide a connection interface for input and output devices such as a display, a mouse, and a keyboard.
  • the network interface 1140 provides a connection interface for various networked devices.
  • the storage interface 1150 provides a connection interface for external storage devices such as floppy disks, USB flash drives, and SD cards.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or other programmable apparatus to produce a machine, such that the instructions, when executed by the processor, produce an apparatus that implements the functions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable memory. These instructions cause a computer to work in a specific manner, thereby producing an article of manufacture that includes instructions implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.
  • the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Biomedical Technology (AREA)
  • Studio Devices (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Studio Circuits (AREA)

Abstract

The present disclosure relates to a broadcast directing method, apparatus and system, and a computer-storable medium, and relates to the field of computer technology. The broadcast directing method includes: acquiring a reference video stream from a reference camera position; performing event recognition on the reference video stream to obtain at least one reference event frame interval, each reference event frame interval corresponding to a unique event and including frame identifiers of a plurality of consecutive images in which the same event occurs; determining, according to a correspondence between events and camera position identifiers, a partial sequence of each reference event frame interval, the partial sequence including a camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and a frame identifier corresponding to that camera position identifier; generating a guide sequence according to the partial sequences; and generating a guide video according to the guide sequence and the video streams of the corresponding camera positions. According to the present disclosure, labor costs are reduced, and the real-time performance and accuracy of directing are improved.

Description

Broadcast directing method, apparatus and system
CROSS-REFERENCE TO RELATED APPLICATION
This application is based on and claims priority to the CN application No. 202010477406.4 filed on May 29, 2020, the disclosure of which is hereby incorporated into this application in its entirety.
TECHNICAL FIELD
The present disclosure relates to the field of computer technology, and in particular to a broadcast directing method, apparatus and system, and a computer-storable medium.
BACKGROUND
In television program production, for programs with fixed scenes, multiple camera positions are usually used to shoot simultaneously, and the multi-camera video streams are then edited and merged according to certain narrative rules to form a guide video with multiple angles and multiple scene scales, so as to improve the comprehensiveness and watchability of the program.
For film and television shooting, there is sufficient time for post-processing and editing after shooting is completed. In a live television scenario, however, the video streams of the camera positions are transmitted directly to a video switcher and, with the team working in coordination under the command of the on-site director, a guide video meeting the live-broadcast delay requirement is synthesized. In this process, the on-site director needs to select the video stream of a suitable camera position for output in light of the situation at the live scene. In addition, some live scenarios also require suitable clips to be selected from the multiple video streams for replay.
Generally, a complete live directing team includes camera operators, editors and an on-site director.
The camera operators are distributed at multiple positions of the live scene and use cameras of different formats to provide live pictures in different forms. The camera operators work with a certain degree of autonomy, that is, they shoot the live scene autonomously according to certain principles. In some specific situations, the camera operators are also controlled by instructions from the on-site director. The editors sit in the outside-broadcast van and are responsible for selecting valuable clips from the multiple video streams coming from the camera operators for replay; most of the time one person has to edit several video streams. The on-site director sits in the outside-broadcast van, watches the multiple real-time video streams and the replay clips provided by the editors, and selects suitable material from them to generate the guide video. The on-site director also needs to direct the camera operators and the editors so as to obtain effective raw video material and brilliant edited clips.
In the related art, the on-site director manually selects suitable video material from the acquired video streams within a short time to synthesize the guide video.
SUMMARY
According to a first aspect of the present disclosure, a broadcast directing method is provided, including: acquiring a reference video stream from a reference camera position; performing event recognition on the reference video stream to obtain at least one reference event frame interval, wherein each reference event frame interval corresponds to a unique event and includes frame identifiers of a plurality of consecutive images in which the same event occurs; determining a partial sequence of each reference event frame interval according to a correspondence between events and camera position identifiers, wherein the partial sequence includes a camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and a frame identifier corresponding to the camera position identifier; generating a guide sequence according to the partial sequences; and generating a guide video according to the guide sequence and the video streams of the corresponding camera positions.
In some embodiments, the at least one reference event frame interval includes an i-th reference event frame interval, where i is a positive integer, and determining the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers includes: determining an initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, wherein the start frame identifier and the end frame identifier of the initial partial sequence are the start frame identifier and the end frame identifier of the i-th reference event frame interval, respectively; acquiring a video stream from at least one first auxiliary camera position; and extending the initial partial sequence of the i-th reference event frame interval by using the video stream from the at least one first auxiliary camera position, to obtain the partial sequence of the i-th reference event frame interval.
In some embodiments, the at least one reference event frame interval further includes an (i+1)-th reference event frame interval, the start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i respectively, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}; extending the initial partial sequence of the i-th reference event frame interval includes: for the case where i is equal to 1, when s_i is not adjacent to 1, or e_i is not adjacent to s_{i+1}, or both, acquiring, as an extension video stream, at least one of the video stream between s_i and 1 and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position; and extending the initial partial sequence of the i-th reference event frame interval by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
In some embodiments, the at least one reference frame interval further includes an (i-1)-th reference event frame interval, and the end frame identifier of the partial sequence of the (i-1)-th reference event frame interval is E_{i-1}; extending the initial partial sequence of the i-th reference event frame interval includes: for the case where i is greater than 1, when s_i is not adjacent to E_{i-1}, or e_i is not adjacent to s_{i+1}, or both, acquiring, as an extension video stream, at least one of the video stream between s_i and E_{i-1} and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position; and extending the initial partial sequence of the i-th reference event frame interval by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
In some embodiments, there are multiple extension video streams coming from multiple first auxiliary camera positions, and extending the initial partial sequence of the i-th reference event frame interval includes: performing face recognition on each extension video stream to obtain at least one face frame interval corresponding to that extension video stream, wherein each face frame interval corresponds to a unique face recognition result and includes frame identifiers of a plurality of consecutive images having the same face recognition result; generating at least one extension frame interval according to the face frame intervals of the extension video streams, wherein each extension frame interval includes at least parts of a plurality of concatenable face frame intervals corresponding to different first auxiliary camera positions; obtaining an extension sequence according to the extension frame interval, among the at least one extension frame interval, that corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames, wherein the extension sequence includes a camera position identifier of each frame of the to-be-played video corresponding to that extension frame interval and a frame identifier corresponding to the camera position identifier; and extending the initial partial sequence of the i-th reference event frame interval according to the extension sequence, to obtain the partial sequence of the i-th reference event frame interval.
In some embodiments, generating at least one extension frame interval according to the face frame intervals of the extension video streams includes: for the extension video stream of each first auxiliary camera position, determining the face frame interval adjacent to the i-th reference event frame interval as an initial extension frame interval; starting from the face frame interval adjacent to the i-th reference event frame interval, in the direction of decreasing frame identifiers or in the direction of increasing frame identifiers, concatenating to the initial extension frame interval at least part of one face frame interval, concatenable with the initial extension frame interval, of a first auxiliary camera position other than the present first auxiliary camera position, so as to update the initial extension frame interval; cyclically updating the initial extension frame interval until there is no longer any face frame interval, of a first auxiliary camera position other than the first auxiliary camera position corresponding to the initial extension frame interval, that is concatenable with the initial extension frame interval; and determining the updated initial extension frame interval as an extension frame interval.
In some embodiments, the at least one reference event frame interval includes an i-th reference event frame interval and an (i+1)-th reference event frame interval, where i is an integer greater than or equal to 1, the start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i respectively, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}; determining the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers includes: determining an initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, wherein the start frame identifier and the end frame identifier of the initial partial sequence are s_i and e_i, respectively; when e_i is not adjacent to s_{i+1}, determining a replay type according to the event corresponding to the i-th reference event frame interval; acquiring at least one replay video stream corresponding to the replay type; and extending the initial partial sequence according to the at least one replay video stream, to obtain the partial sequence of the i-th reference event frame interval.
In some embodiments, extending the initial partial sequence includes: generating at least one replay sequence according to the at least one replay video stream, wherein each replay sequence includes a camera position identifier of each frame located between e_i and s_{i+1} and a frame identifier corresponding to the camera position identifier; and extending the initial partial sequence by using the at least one replay sequence.
In some embodiments, the replay type includes a first replay type, and generating the replay sequence according to the replay video stream includes: when the replay type is the first replay type, performing event recognition on the at least one replay video stream to obtain at least one auxiliary event frame interval, wherein the auxiliary event frame interval includes frame identifiers of a plurality of consecutive images in which the event corresponding to the i-th reference event frame interval occurs; and generating at least one replay sequence according to the at least one auxiliary event frame interval.
In some embodiments, generating at least one replay sequence according to the at least one auxiliary event frame interval includes: sorting the at least one auxiliary event frame interval according to the total number of frames and the weight of each auxiliary event frame interval; and generating at least one replay sequence according to the sorting result.
In some embodiments, the replay type includes a first replay type, and acquiring at least one replay video stream corresponding to the replay type includes: when the replay type is the first replay type, acquiring the video stream between s_i - m and e_i + n from the at least one first auxiliary camera position as the replay video stream, where m and n are both integers greater than or equal to 0.
In some embodiments, the replay type includes a second replay type, and acquiring the replay video stream corresponding to the replay type includes: when the replay type is the second replay type, acquiring, according to the reference video stream, the camera position angle corresponding to each frame between s_i and e_i; determining, according to the camera position angles, the region where the event corresponding to the i-th reference event frame interval occurs; and acquiring the video stream between s_i and e_i of at least one second auxiliary camera position located in that region as the replay video stream.
In some embodiments, the at least one reference event frame interval includes an i-th reference event frame interval and an (i+1)-th reference event frame interval, where i is an integer greater than or equal to 1, and generating the guide sequence includes: when the end frame identifier E_i of the partial camera position sequence of the i-th reference event frame interval is not adjacent to the start frame identifier S_{i+1} of the partial camera position sequence of the (i+1)-th reference event frame interval, generating a supplementary sequence, wherein the supplementary sequence includes the camera position and the frame identifier of each frame located between E_i and S_{i+1}, and the camera position of each frame located between E_i and S_{i+1} is a third auxiliary camera position; and merging the partial sequences and the supplementary sequence to obtain the guide sequence.
In some embodiments, the reference camera position is used to provide a close-up video stream of the player with the ball, the first auxiliary camera positions are used to provide close-up video streams from different angles on the field, the second auxiliary camera positions are used to provide standard video streams from different angles on the field, and the third auxiliary camera position is used to provide a standard video stream from the audience's viewing angle.
In some embodiments, generating the guide video includes: acquiring the frames corresponding to the guide sequence according to the guide sequence and the video streams of the corresponding camera positions; and encoding the frames to obtain the guide video.
According to a second aspect of the present disclosure, a broadcast directing apparatus is provided, including: an obtaining module configured to acquire a reference video stream from a reference camera position; an event recognition module configured to perform event recognition on the reference video stream to obtain at least one reference event frame interval, wherein each reference event frame interval corresponds to a unique event and includes frame identifiers of a plurality of consecutive images in which the same event occurs; a determining module configured to determine a partial sequence of each reference event frame interval according to a correspondence between events and camera position identifiers, wherein the partial sequence includes a camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and a frame identifier corresponding to the camera position identifier; a first generating module configured to generate a guide sequence according to the partial sequences; and a second generating module configured to generate a guide video according to the guide sequence and the video streams of the corresponding camera positions.
According to a third aspect of the present disclosure, a broadcast directing apparatus is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute the broadcast directing method according to any of the above embodiments based on instructions stored in the memory.
According to a fourth aspect of the present disclosure, a broadcast directing system is provided, including: the broadcast directing apparatus according to any of the above embodiments; and at least one camera configured to generate a video stream and send the video stream to the broadcast directing apparatus.
According to a fifth aspect of the present disclosure, a computer-storable medium is provided, on which computer program instructions are stored, and the instructions, when executed by a processor, implement the broadcast directing method according to any of the above embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which constitute a part of the specification, describe embodiments of the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart showing a broadcast directing method according to some embodiments of the present disclosure;
FIG. 2 is a layout of the camera positions at a live scene according to some embodiments of the present disclosure;
FIG. 3 is a schematic diagram showing the merging of reference event frame intervals according to some embodiments of the present disclosure;
FIG. 4 is a flowchart showing the determination of the partial sequence of each reference event frame interval according to some embodiments of the present disclosure;
FIG. 5a is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to some embodiments of the present disclosure;
FIG. 5b is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to other embodiments of the present disclosure;
FIG. 6a is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to some embodiments of the present disclosure;
FIG. 6b is a schematic diagram showing the generation of at least one extension frame interval according to some embodiments of the present disclosure;
FIG. 7 is a flowchart showing the determination of the partial sequence of each reference event frame interval according to other embodiments of the present disclosure;
FIG. 8 is a block diagram showing a broadcast directing apparatus according to some embodiments of the present disclosure;
FIG. 9 is a block diagram showing a broadcast directing apparatus according to other embodiments of the present disclosure;
FIG. 10 is a block diagram showing a broadcast directing system according to some embodiments of the present disclosure;
FIG. 11 is a block diagram showing a computer system for implementing some embodiments of the present disclosure.
DETAILED DESCRIPTION
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
Meanwhile, it should be understood that, for convenience of description, the dimensions of the various parts shown in the drawings are not drawn according to actual proportional relationships.
The following description of at least one exemplary embodiment is merely illustrative and in no way serves as any limitation on the present disclosure or its application or use.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be regarded as part of the specification.
In all the examples shown and discussed herein, any specific value should be interpreted as merely exemplary rather than as a limitation. Therefore, other examples of the exemplary embodiments may have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
In the related art, labor costs are high, and real-time performance and accuracy are poor. On this basis, the present disclosure proposes a broadcast directing method that can reduce labor costs and improve the real-time performance and accuracy of directing.
The broadcast directing method of some embodiments of the present disclosure will be described in detail below with reference to FIG. 1 and FIG. 2.
FIG. 1 is a flowchart showing a broadcast directing method according to some embodiments of the present disclosure.
FIG. 2 is a layout of the camera positions at a live scene according to some embodiments of the present disclosure.
As shown in FIG. 1, the broadcast directing method includes: step S10, acquiring a reference video stream from a reference camera position; step S20, performing event recognition on the reference video stream to obtain at least one reference event frame interval; step S30, determining the partial sequence of each reference event frame interval; step S40, generating a guide sequence according to the partial sequences; and step S50, generating a guide video according to the guide sequence and the video streams of the corresponding camera positions. For example, the broadcast directing method is executed by a broadcast directing apparatus.
In the present disclosure, the partial sequence of each reference event frame interval is obtained through event recognition, and directing is performed according to the partial sequences, which realizes automatic directing, reduces labor costs, and improves the real-time performance and accuracy of directing.
In addition, by realizing automated directing, the working difficulty of the on-site directing team is greatly reduced: the on-site director only needs to direct the camera operators to shoot suitable video material, while the generation and output of the guide video are completed efficiently and automatically by a computer. Moreover, computer code is customizable, which makes it convenient to modify and customize the directing logic, enables personalized guide-video output tailored to different viewers, and greatly enriches the audience's choices.
In step S10, a reference video stream from a reference camera position is acquired. In some embodiments, the reference video stream from the reference camera position is acquired through an input interface.
For example, the reference camera position is camera CAM-2 shown in FIG. 2. Camera CAM-2 is a 4K camera that provides a close-up video stream of the player with the ball. The lens of camera CAM-2 is a lens of 100x or more, serving as a stand close-up camera position. In some embodiments, the camera position identifier of camera CAM-2 is 2. For example, the cameras in FIG. 2 are located around the field.
In step S20, event recognition is performed on the reference video stream to obtain at least one reference event frame interval. Each reference event frame interval corresponds to a unique event. Each reference event frame interval includes frame identifiers of a plurality of consecutive images in which the same event occurs.
In some embodiments, a video event recognition algorithm is used for event recognition. For example, the event recognition algorithm includes, but is not limited to, the P3D ResNet (Pseudo-3D Residual Networks) algorithm.
For example, event recognition on the reference video stream is realized in the following way.
First, the video event recognition algorithm is used to obtain the event recognition result of each frame of the reference video stream. In some embodiments, the event recognition result of each frame is denoted as P = [p_1, ..., p_cls]. p_cls represents the probability that the event whose event identifier is cls occurs, or the probability that no event occurs. For example, in a live football scenario, cls takes integer values greater than or equal to 1 and less than or equal to 7, representing six different events and the no-event case, respectively.
In some embodiments, the events in the live football scenario include, but are not limited to, shot, free kick, corner kick, goal kick, throw-in and player conflict.
Next, a smoothing operation is performed on the event recognition results of the frames of the reference video stream to obtain the smoothed event recognition result of each frame. For example, a time window with a length of t seconds is used to perform the smoothing operation on the event recognition results of the frames with a stride of 1 frame. In some embodiments, t is equal to 0.5. Through the smoothing operation, the error of event recognition can be reduced and event recognition becomes more accurate, thereby improving the accuracy of directing.
For example, the smoothed event recognition result is denoted as P̄_{(1+m)/2}, where m = f × t and f is the frame rate of the reference video stream; P̄_{(1+m)/2} represents the smoothed event recognition result of the middle frame between the 1st frame and the m-th frame of the window.
Then, for each frame, the event corresponding to the maximum probability in the smoothed event recognition result, or the no-event case, is determined as the final event recognition result of that frame.
Finally, the frame identifiers of a plurality of consecutive images in which the same event occurs are merged to obtain at least one reference event frame interval. In some embodiments, a plurality of reference event frame intervals that correspond to the same event and are separated by multiple frames of event-free images may also be merged into one reference event frame interval.
FIG. 3 is a schematic diagram showing the merging of reference event frame intervals according to some embodiments of the present disclosure.
As shown in FIG. 3, a, b and c respectively represent reference event frame intervals of different events. For the reference video stream, there are two reference event frame intervals c, separated by multiple frames of event-free images. For example, when the number of frames of the event-free images in between is less than or equal to a preset threshold, the two reference event frame intervals c are merged into one event frame interval c'. In some embodiments, the preset threshold is f × t_1. For example, t_1 is 0.5 seconds.
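The smoothing-and-merging procedure described above can be illustrated with a short sketch. The snippet below is not part of the original disclosure: it assumes the per-frame event probabilities have already been produced by an event recognizer (e.g., P3D ResNet), it assumes class index 6 is the no-event class, and all function and variable names are illustrative.

```python
import numpy as np

def smooth_and_merge(probs, fps, t=0.5, t1=0.5, no_event_id=6):
    """probs: (N, 7) array of per-frame event probabilities (6 events + no-event).
    Returns a list of (event_id, start_frame, end_frame) reference event frame intervals."""
    m = max(1, int(fps * t))                       # sliding-window length in frames
    kernel = np.ones(m) / m
    smoothed = np.apply_along_axis(                # stride-1 moving average per class
        lambda col: np.convolve(col, kernel, mode="same"), 0, probs)
    labels = smoothed.argmax(axis=1)               # final per-frame event decision

    # merge consecutive frames with the same event into intervals
    intervals = []
    for f, lab in enumerate(labels):
        if lab == no_event_id:
            continue
        if intervals and intervals[-1][0] == lab and f - intervals[-1][2] <= 1:
            intervals[-1][2] = f
        else:
            intervals.append([lab, f, f])

    # merge same-event intervals separated by at most fps*t1 event-free frames
    merged, gap = [], int(fps * t1)
    for lab, s, e in intervals:
        if merged and merged[-1][0] == lab and s - merged[-1][2] <= gap:
            merged[-1][2] = e
        else:
            merged.append([lab, s, e])
    return [tuple(iv) for iv in merged]
```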
Returning to FIG. 1, after the at least one reference event frame interval is obtained, step S30 is executed.
In step S30, the partial sequence of each reference event frame interval is determined according to the correspondence between events and camera position identifiers. The partial sequence includes the camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and the frame identifier corresponding to that camera position identifier.
For example, the at least one reference event frame interval includes the i-th reference event frame interval, where i is a positive integer. In some embodiments, the partial sequence of the i-th reference event frame interval may be expressed as {c_j(k, z), j = S_i, ..., E_i}. S_i and E_i are the start frame identifier and the end frame identifier of the partial sequence of the i-th reference event frame interval, respectively. j is the frame identifier of the to-be-played video, with j greater than or equal to S_i and less than or equal to E_i. k is a camera position identifier, and z is the frame identifier of the video stream corresponding to that camera position. c_j(k, z) indicates that the j-th frame of the to-be-played video corresponding to the i-th reference event frame interval is the z-th frame from the camera position whose identifier is k.
For example, Table 1 shows the correspondence between events and camera position identifiers in a live football scenario.
Table 1 Correspondence between events and camera position identifiers
Event:                        Shot   Corner kick   Free kick   Player conflict   Goal kick   Throw-in
Camera position identifier:   1      2             2           1                 2           2
As shown in Table 1, in the live football scenario, shot, corner kick, free kick, player conflict, goal kick and throw-in correspond to camera position identifiers 1, 2, 2, 1, 2 and 2, respectively. For example, the camera position with identifier 1 is camera CAM-1 shown in FIG. 2. Camera CAM-1 is a 4K camera used to provide a standard video stream from the audience's viewing angle. Camera CAM-1 has a standard lens and serves as a stand panorama camera position.
For example, step S30 is implemented through the steps shown in FIG. 4.
FIG. 4 is a flowchart showing the determination of the partial sequence of each reference event frame interval according to some embodiments of the present disclosure.
As shown in FIG. 4, determining the partial sequence of each reference event frame interval includes steps S31 to S33.
In step S31, the initial partial sequence of the i-th reference event frame interval is determined according to the correspondence between events and camera position identifiers. The start frame identifier and the end frame identifier of the initial partial sequence are the start frame identifier and the end frame identifier of the i-th reference event frame interval, respectively.
For example, the event of the i-th reference event frame interval is a corner kick. According to Table 1, the camera position identifier corresponding to a corner kick is 2.
In some embodiments, the start frame identifier and the end frame identifier of the i-th reference event frame interval are s_i and e_i, respectively. The initial partial sequence of the i-th reference event frame interval is expressed as {c_j(2, j), j = s_i, ..., e_i}. c_j(2, j) indicates that the j-th frame of the to-be-played video corresponding to the initial partial sequence is the j-th frame of the video stream whose camera position identifier is 2.
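As an illustration of step S31, Table 1 and the initial partial sequence {c_j(2, j)} might be expressed as follows. This is a minimal sketch; the names and data layout are hypothetical and not part of the disclosure.

```python
# Table 1, expressed as a mapping from event name to camera position identifier
EVENT_TO_CAMERA = {
    "shot": 1, "corner_kick": 2, "free_kick": 2,
    "player_conflict": 1, "goal_kick": 2, "throw_in": 2,
}

def initial_partial_sequence(event, s_i, e_i):
    """Each entry (j, k, z) means: frame j of the to-be-played video is
    frame z of the video stream whose camera position identifier is k."""
    k = EVENT_TO_CAMERA[event]
    return [(j, k, j) for j in range(s_i, e_i + 1)]

# e.g. a corner kick spanning frames 1200..1349 of the reference stream
seq = initial_partial_sequence("corner_kick", 1200, 1349)
```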
In step S32, a video stream from at least one first auxiliary camera position is acquired. For example, the first auxiliary camera positions are used to provide close-up video streams from different angles on the field.
For example, in the live football scenario, the first auxiliary camera positions are cameras CAM-3, CAM-7, CAM-8 and CAM-10 shown in FIG. 2, whose camera position identifiers are 3, 7, 8 and 10, respectively. Cameras CAM-3, CAM-7, CAM-8 and CAM-10 are all 4K cameras, equipped with lenses of 80x or more, 40x or more, 80x or more and 80x or more, respectively. Cameras CAM-3, CAM-7 and CAM-10 are ground camera positions, and camera CAM-8 is a stand camera position.
In step S33, the initial partial sequence of the i-th reference event frame interval is extended by using the video stream from the at least one first auxiliary camera position, to obtain the partial sequence of the i-th reference event frame interval.
FIG. 5a is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to some embodiments of the present disclosure.
In some embodiments, the at least one reference event frame interval further includes an (i+1)-th reference event frame interval, and the start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}.
As shown in FIG. 5a, extending the initial partial sequence of the i-th reference event frame interval includes steps S331 to S332.
In step S331, for the case where i is equal to 1, when s_i is not adjacent to 1, or e_i is not adjacent to s_{i+1}, or both, at least one of the video stream between s_i and 1 and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position is acquired as an extension video stream. For example, "not adjacent" here means that the difference between s_i and 1, or the difference between s_{i+1} and e_i, is greater than a preset difference. In some embodiments, the preset difference is 0 or f × t_2. For example, t_2 is 2 seconds.
In step S332, the initial partial sequence of the i-th reference event frame interval is extended by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
FIG. 5b is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to other embodiments of the present disclosure.
In some embodiments, the at least one reference frame interval further includes an (i-1)-th reference event frame interval, and the end frame identifier of the partial sequence of the (i-1)-th reference event frame interval is E_{i-1}.
As shown in FIG. 5b, extending the initial partial sequence of the i-th reference event frame interval includes steps S331' to S332'.
In step S331', for the case where i is greater than 1, when s_i is not adjacent to E_{i-1}, or e_i is not adjacent to s_{i+1}, or both, at least one of the video stream between s_i and E_{i-1} and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position is acquired as an extension video stream. For example, "not adjacent" here means that the difference between s_i and E_{i-1}, or the difference between s_{i+1} and e_i, is greater than a preset difference. In some embodiments, the preset difference is 0 or f × t_2. For example, t_2 is 2 seconds.
In some embodiments, when e_i is not adjacent to s_{i+1} and the difference between s_{i+1} and e_i is greater than a preset value, the video stream from e_i to the sum of e_i and the preset value from the at least one first auxiliary camera position is acquired as the extension video stream.
In step S332', the initial partial sequence of the i-th reference event frame interval is extended by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
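The adjacency test of steps S331 and S331', which decides on which side extension footage should be fetched, can be sketched as follows. This is illustrative only; the preset difference of 0 or f × t_2 follows the text, while the function name and return convention are assumptions.

```python
def needs_extension(prev_end, s_i, e_i, next_start, fps, t2=2.0, use_window=True):
    """Return which sides of the i-th reference event frame interval should be
    extended with footage from the first auxiliary camera positions.
    prev_end is E_{i-1} (or 1 when i == 1); next_start is s_{i+1}."""
    preset = int(fps * t2) if use_window else 0
    extend_before = (s_i - prev_end) > preset
    extend_after = (next_start - e_i) > preset
    return extend_before, extend_after
```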
The process of extending the initial partial sequence of the i-th reference event frame interval will be described in detail below with reference to FIG. 6a and FIG. 6b.
FIG. 6a is a flowchart showing the extension of the initial partial sequence of the i-th reference event frame interval according to some embodiments of the present disclosure.
FIG. 6b is a schematic diagram showing the generation of at least one extension frame interval according to some embodiments of the present disclosure.
For example, there are multiple extension video streams, and the multiple extension video streams come from the same frame interval of multiple first auxiliary camera positions.
As shown in FIG. 6a, extending the initial partial sequence of the i-th reference event frame interval includes steps S3321 to S3324.
In step S3321, face recognition is performed on each extension video stream to obtain at least one face frame interval corresponding to that extension video stream. Each face frame interval corresponds to a unique face recognition result. Each face frame interval includes frame identifiers of a plurality of consecutive images having the same face recognition result. In some embodiments, the total number of frames of each face frame interval is greater than a preset total number of frames. For example, the preset total number of frames is f × t_2, where t_2 is, for example, 2 seconds. By controlling the total number of frames of each face frame interval, the viewing experience of the audience can be improved.
In some embodiments, face recognition is performed by using the face detection SDK (Software Development Kit) provided by the Neuhub JD AI open platform, so as to obtain the face recognition result of each frame of each extension video stream. Then, at least one face frame interval is obtained according to the frame identifiers of a plurality of consecutive images having the same face recognition result. For example, the face recognition result of each frame is the face attribute contained in that frame. Face attributes include, but are not limited to, coach, substitute player and sideline referee.
For example, for the case where s_i is not adjacent to E_{i-1} or s_i is not adjacent to 1, there are extension video stream 1, extension video stream 2 and extension video stream 3 shown in FIG. 6b, which come from the first auxiliary camera positions with camera position identifiers 3, 7 and 8 shown in FIG. 2, respectively.
The extension video streams come from different first auxiliary camera positions. Extension video stream 1 corresponds to face frame interval 11 and face frame interval 12, where face frame interval 11 is [x_1, x_2] and face frame interval 12 is [x_3, s_i - 1]. Extension video stream 2 corresponds to face frame interval 21 and face frame interval 22, where face frame interval 21 is [x_4, x_5] and face frame interval 22 is [x_6, s_i - 1]. Extension video stream 3 corresponds to face frame interval 31, which is [x_7, s_i - 1]. Here x_1 < x_4 < x_7 < x_2 < x_5 < x_6 < x_3 < s_i - 1.
In step S3322, at least one extension frame interval is generated according to the face frame intervals of the extension video streams. Each extension frame interval includes at least parts of a plurality of concatenable face frame intervals corresponding to different first auxiliary camera positions. "Concatenable" here means that two face frame intervals are adjacent or overlap.
For example, generating at least one extension frame interval according to the face frame intervals of the extension video streams is realized in the following way.
First, for the extension video stream of each first auxiliary camera position, the face frame interval adjacent to the i-th reference event frame interval is determined as an initial extension frame interval.
For example, for extension video stream 1 shown in FIG. 6b, face frame interval 12 is determined as the initial extension frame interval.
Next, starting from the face frame interval adjacent to the i-th reference event frame interval, in the direction of decreasing frame identifiers or in the direction of increasing frame identifiers, at least part of one face frame interval of a first auxiliary camera position other than the present first auxiliary camera position, which is concatenable with the initial extension frame interval, is concatenated to the initial extension frame interval, so as to update the initial extension frame interval.
For example, in the case where s_i is not adjacent to E_{i-1} or s_i is not adjacent to 1, for extension video stream 1 shown in FIG. 6b, starting from face frame interval 12 and moving in the direction of decreasing frame identifiers, the part [x_7, x_3 - 1] of face frame interval 31 of another first auxiliary camera position, which is concatenable with face frame interval 12, is concatenated to the initial extension frame interval, so as to update the initial extension frame interval.
Then, the initial extension frame interval is updated cyclically until there is no longer any face frame interval, of a first auxiliary camera position other than the one corresponding to the initial extension frame interval, that is concatenable with the initial extension frame interval.
For example, in the case where s_i is not adjacent to E_{i-1} or s_i is not adjacent to 1, for extension video stream 1 shown in FIG. 6b, by cyclically updating the initial extension frame interval, the part [x_4, x_7 - 1] of face frame interval 21 of yet another first auxiliary camera position, which is concatenable with the part [x_7, x_3 - 1] of face frame interval 31, can further be concatenated to the initial extension frame interval, thereby updating the initial extension frame interval [x_3, s_i - 1].
Finally, the updated initial extension frame interval is determined as an extension frame interval.
For example, in the case where s_i is not adjacent to E_{i-1} or s_i is not adjacent to 1, for extension video stream 1 shown in FIG. 6b, an extension frame interval obtained by concatenating the part [x_4, x_7 - 1] of face frame interval 21, the part [x_7, x_3 - 1] of face frame interval 31 and the whole [x_3, s_i - 1] of face frame interval 12 can be determined.
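The chaining procedure of step S3322 can be sketched as a greedy loop over the face frame intervals of the first auxiliary camera positions. The code below is one illustrative reading of the steps above, not the original implementation; the interval representation, the tie-breaking rule and the scoring of candidate chains are assumptions.

```python
def build_extension_interval(face_intervals, s_i):
    """face_intervals: {camera_id: [(start, end), ...]} face frame intervals per
    first auxiliary camera position.  Builds one chained extension interval per
    camera whose last face interval ends at s_i - 1, growing towards smaller
    frame identifiers, and returns the chain covering the most cameras
    (ties broken by total number of frames)."""
    chains = []
    for cam, ivs in face_intervals.items():
        seed = next(((a, b) for a, b in ivs if b == s_i - 1), None)
        if seed is None:
            continue
        chain = [(cam, seed[0], seed[1])]          # segments as (camera, start, end)
        used = {cam}
        while True:
            front = chain[0][1]                    # smallest frame id covered so far
            # concatenable = adjacent to, or overlapping, the current chain front
            candidates = [
                (c, a, b) for c, ivs2 in face_intervals.items() if c not in used
                for a, b in ivs2 if a < front <= b + 1
            ]
            if not candidates:
                break
            c, a, b = max(candidates, key=lambda x: front - x[1])  # longest usable part
            chain.insert(0, (c, a, front - 1))     # keep only the non-overlapping part
            used.add(c)
        chains.append(chain)

    def score(chain):
        cams = len({c for c, _, _ in chain})
        frames = sum(e - s + 1 for _, s, e in chain)
        return (cams, frames)

    return max(chains, key=score) if chains else None
```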
In step S3323, an extension sequence is obtained according to the extension frame interval, among the at least one extension frame interval, that corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames. The extension sequence includes the camera position identifier of each frame of the to-be-played video corresponding to that extension frame interval and the frame identifier corresponding to the camera position identifier.
For example, in the case where s_i is not adjacent to E_{i-1} or s_i is not adjacent to 1, for extension video stream 1 shown in FIG. 6b, the extension frame interval obtained by concatenating the part [x_4, x_7 - 1] of face frame interval 21, the part [x_7, x_3 - 1] of face frame interval 31 and the whole [x_3, s_i - 1] of face frame interval 12 corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames. The extension sequence is obtained according to this extension frame interval.
For example, the obtained extension sequence can be written as {c_j(7, j), j = x_4, ..., x_7 - 1; c_j(8, j), j = x_7, ..., x_3 - 1; c_j(3, j), j = x_3, ..., s_i - 1}.
In step S3324, the initial partial sequence of the i-th reference event frame interval is extended according to the extension sequence, to obtain the partial sequence of the i-th reference event frame interval.
For example, the partial sequence of the i-th reference event frame interval obtained by the extension is {c_j(7, j), j = x_4, ..., x_7 - 1; c_j(8, j), j = x_7, ..., x_3 - 1; c_j(3, j), j = x_3, ..., s_i - 1; c_j(2, j), j = s_i, ..., e_i}.
For example, for the case where e_i is not adjacent to s_{i+1}, extension can also be performed in the direction of increasing frame identifiers to obtain an extension sequence. This extension sequence is used to extend the initial partial sequence by being appended after {c_j(2, j), j = s_i, ..., e_i}.
In some embodiments, for the case where e_i is not adjacent to s_{i+1}, considering that a certain reaction time should be reserved for the camera operators, the start identifier of the extension sequence in this case is usually a frame identifier separated by a certain number of frames from the end frame identifier of the i-th event frame interval. The sequence between the start frame identifier of the extension sequence and the end frame identifier of the i-th event frame interval is supplemented with a sequence of the corresponding frames from the third auxiliary camera position.
For example, the third auxiliary camera position is used to provide a standard video stream from the audience's viewing angle. In some embodiments, in the live football scenario, the third auxiliary camera position is camera CAM-1 in FIG. 2. Camera CAM-1 is a 4K camera with a standard lens, serving as a stand panorama camera position.
In some embodiments, when both the case where s_i is not adjacent to E_{i-1} and the case where e_i is not adjacent to s_{i+1} exist, or when both the case where s_i is not adjacent to 1 and the case where e_i is not adjacent to s_{i+1} exist, two extension sequences are obtained at the same time, and the initial partial sequence is extended accordingly on both sides.
For example, step S30 of determining the partial sequence of each reference event frame interval can also be implemented through the steps shown in FIG. 7.
FIG. 7 is a flowchart showing the determination of the partial sequence of each reference event frame interval according to other embodiments of the present disclosure.
As shown in FIG. 7, determining the partial sequence of each reference event frame interval includes steps S31' to S34'.
In step S31', the initial partial sequence of the i-th reference event frame interval is determined according to the correspondence between events and camera position identifiers. The start frame identifier and the end frame identifier of the initial partial sequence are s_i and e_i, respectively.
In step S32', when e_i is not adjacent to s_{i+1}, a replay type is determined according to the event corresponding to the i-th reference event frame interval. For example, in the live football scenario, the replay types include a first replay type and a second replay type. In some embodiments, the first replay type is close-up camera position slow-speed replay, and the second replay type is standard camera position normal-speed replay.
For example, when the event is a player conflict, the replay type is close-up camera position slow-speed replay. When the event is a shot, a corner kick or a free kick, the replay type is standard camera position normal-speed replay.
In step S33', at least one replay video stream corresponding to the replay type is acquired.
For example, when the replay type is the first replay type, the video streams between s_i - m and e_i + n from the at least one first auxiliary camera position are acquired as the replay video streams, where m and n are both integers greater than or equal to 0. Usually, because close-up camera positions may be occluded, it cannot be guaranteed that all close-up camera positions capture pictures of the same event when the event occurs; therefore, a certain range is added on each side of the start frame identifier and the end frame identifier of the i-th event frame interval to obtain the replay video streams.
For example, when the replay type is the second replay type, the camera position angle corresponding to each frame between s_i and e_i is acquired according to the reference video stream. Then, the region where the event corresponding to the i-th reference event frame interval occurs is determined according to the camera position angles. Thus, the video stream between s_i and e_i of at least one second auxiliary camera position located in that region is acquired as the replay video stream. For example, the value range of the camera position angle is [-90, 90], in degrees.
For example, the second auxiliary camera positions are used to provide standard video streams from different angles on the field. In some embodiments, the second auxiliary camera positions are cameras CAM-4, CAM-5, CAM-6 and CAM-9 in FIG. 2, whose camera position identifiers are 4, 5, 6 and 9, respectively; all of them are 4K cameras with standard lenses. Camera CAM-4 and camera CAM-6 are the left-stand offside camera position and the left ground camera position, respectively. Camera CAM-5 and camera CAM-9 are the right-stand offside camera position and the right ground camera position, respectively.
For example, determining the region where the event occurs is realized in the following way.
First, a camera position angle sequence A = [a_0, a_1, ..., a_{e_i - s_i - 1}] is generated from the camera position angles of the frames.
Then, a first-order linear regression equation of the camera position angle sequence A is computed: a = k × x + b, where a is the angle and x is the index into the angle sequence A, x ∈ [0, e_i - s_i), x ∈ N. This linear equation briefly describes how the angle of the reference camera position changes within the i-th event frame interval.
For example, when k × b is greater than 0 (k and b have the same sign), the region where the event occurs is within one half of the field. When k is positive and b is positive, the camera position angle is biased toward the right half of the field when the event starts, and gradually shifts to the right as the event proceeds. When k is negative and b is negative, the camera position angle is biased toward the left half of the field when the event starts, and gradually shifts to the left as the event proceeds.
When k × b is less than 0 (k and b have different signs), the event crosses the half-way line as it occurs. An event that crosses the half-way line is considered to pose no threat to the goal and is not replayed.
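A minimal sketch of this region-determination step, assuming the per-frame camera position angles are already available; NumPy's polyfit stands in for whatever regression routine is actually used, and the behaviour when k × b is exactly 0 is an assumption.

```python
import numpy as np

def locate_event_region(angles):
    """angles: camera position angles (degrees, in [-90, 90]) of the reference
    camera for the frames of the i-th event frame interval.  Fits a = k*x + b
    and decides the half of the field in which the event takes place, or
    returns None if the event crosses the half-way line (no replay)."""
    x = np.arange(len(angles))
    k, b = np.polyfit(x, np.asarray(angles, dtype=float), 1)  # first-order fit
    if k * b > 0:
        return "right" if k > 0 else "left"
    return None
```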
In step S34', the initial partial sequence is extended according to the at least one replay video stream, to obtain the partial sequence of the i-th reference event frame interval.
For example, extending the initial partial sequence according to the at least one replay video stream is realized in the following way.
First, at least one replay sequence is generated according to the at least one replay video stream. Each replay sequence includes the camera position identifier of each frame located between e_i and s_{i+1} and the frame identifier corresponding to the camera position identifier.
For example, when the replay type is the first replay type, event recognition is performed on the at least one replay video stream to obtain at least one auxiliary event frame interval. Then, at least one replay sequence is generated according to the at least one auxiliary event frame interval. An auxiliary event frame interval includes frame identifiers of a plurality of consecutive images in which the event corresponding to the i-th reference event frame interval occurs.
For example, the at least one auxiliary event frame interval is sorted according to the total number of frames and the weight of each auxiliary event frame interval. Then, at least one replay sequence is generated according to the sorting result. In some embodiments, when the first replay type is close-up camera position slow-speed replay, frame interpolation is performed on the replay sequence according to the slow-motion rate to generate a slow-speed replay sequence. When the camera position corresponding to the replay sequence is a high-speed camera, no frame interpolation is needed.
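A sketch of how the auxiliary event frame intervals might be ranked and expanded into slow-speed replay sequences. The exact combination of total frame count and weight is not specified in the text, so a weighted length is assumed here, and simple frame repetition stands in for real frame interpolation; all names are illustrative.

```python
def build_slow_replay_sequences(aux_intervals, slow_factor=2):
    """aux_intervals: list of dicts such as
    {"camera": 7, "start": 3010, "end": 3100, "weight": 1.0},
    obtained by event recognition on the replay video streams.  Sorts them by
    weighted length and expands each into a slow-speed replay sequence by
    repeating every source frame `slow_factor` times."""
    ranked = sorted(
        aux_intervals,
        key=lambda iv: (iv["end"] - iv["start"] + 1) * iv["weight"],
        reverse=True)
    sequences = []
    for iv in ranked:
        seq = [(iv["camera"], z)
               for z in range(iv["start"], iv["end"] + 1)
               for _ in range(slow_factor)]
        sequences.append(seq)
    return sequences
```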
In some embodiments, when the replay type is the second replay type, after the video stream between s_i and e_i of at least one second auxiliary camera position located in the region is acquired as the replay video stream, at least one replay sequence is generated according to the at least one replay video stream. For example, the corresponding replay sequence is generated according to the start frame identifier and the end frame identifier of each replay video stream and its corresponding camera position identifier.
Then, after the at least one replay sequence is generated, the initial partial sequence is extended by using the at least one replay sequence.
For example, as many replay sequences as possible are concatenated after the end frame identifier of the initial partial sequence to obtain the partial sequence. In some embodiments, after the initial partial sequence has been extended with the extension sequence, as many replay sequences as possible may further be concatenated to obtain the partial sequence.
Returning to FIG. 1, after the partial sequence of each reference event frame interval is determined, step S40 is executed.
In step S40, a guide sequence is generated according to the partial sequences.
For example, the partial sequences are merged to obtain the guide sequence.
In some embodiments, when the end frame identifier E_i of the partial camera position sequence of the i-th reference event frame interval is not adjacent to the start frame identifier S_{i+1} of the partial camera position sequence of the (i+1)-th reference event frame interval, a supplementary sequence is generated. The supplementary sequence includes the camera position and the frame identifier of each frame located between E_i and S_{i+1}, and the camera position of each frame located between E_i and S_{i+1} is the third auxiliary camera position. The partial sequences and the supplementary sequences are then merged to obtain the guide sequence.
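Merging the partial sequences with supplementary sequences from the third auxiliary camera position (camera position identifier 1 in the football example) might look like the following sketch; the data layout is an assumption, not part of the disclosure.

```python
def build_guide_sequence(partial_sequences, third_aux_camera=1):
    """partial_sequences: list of partial sequences, each a list of
    (j, camera_id, frame_id) entries sorted by j (the to-be-played frame id).
    Gaps between consecutive partial sequences are filled with a supplementary
    sequence taken from the third auxiliary camera position."""
    guide = []
    for seq in sorted(partial_sequences, key=lambda s: s[0][0]):
        if guide:
            prev_end = guide[-1][0]
            next_start = seq[0][0]
            # supplementary sequence for the gap (E_i, S_{i+1})
            guide.extend((j, third_aux_camera, j)
                         for j in range(prev_end + 1, next_start))
        guide.extend(seq)
    return guide
```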
In step S50, a guide video is generated according to the guide sequence and the video streams of the corresponding camera positions. In some embodiments, the frames corresponding to the guide sequence are acquired according to the guide sequence and the video streams of the corresponding camera positions. The frames are then encoded to obtain the guide video.
For example, after the video streams from the cameras are acquired through the video input interface, the video streams are stored in a buffer. After the guide sequence is obtained, according to the camera position identifier of each frame and the frame identifier corresponding to that camera position identifier provided by the guide sequence, the image with the corresponding frame identifier is fetched from the buffered video stream of the corresponding camera position, and the frames are encoded in order to obtain the guide video.
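A minimal sketch of step S50, assuming the buffered streams are held as per-camera dictionaries of decoded frames and using OpenCV purely as an illustrative encoder; the disclosure does not name a specific library, and all names here are hypothetical.

```python
import cv2

def render_guide_video(guide_sequence, frame_buffers, out_path, fps=25):
    """guide_sequence: list of (j, camera_id, frame_id) entries in playback order.
    frame_buffers: {camera_id: {frame_id: BGR image as a numpy array}} holding the
    buffered video streams of the corresponding camera positions.
    Encodes the selected frames, in order, into a single guide video file."""
    first = next(iter(frame_buffers.values()))
    h, w = next(iter(first.values())).shape[:2]
    writer = cv2.VideoWriter(out_path,
                             cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for _, camera_id, frame_id in guide_sequence:
        writer.write(frame_buffers[camera_id][frame_id])
    writer.release()
```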
In some embodiments, the guide video is output through a video output interface for live broadcasting.
FIG. 8 is a block diagram showing a broadcast directing apparatus according to some embodiments of the present disclosure.
As shown in FIG. 8, the broadcast directing apparatus 8 includes an obtaining module 81, an event recognition module 82, a determining module 83, a first generating module 84 and a second generating module 85.
The obtaining module 81 is configured to acquire a reference video stream from a reference camera position, for example, to execute step S10 shown in FIG. 1.
In some embodiments, the broadcast directing apparatus 8 further includes an input interface 80. The obtaining module 81 acquires the reference video stream from the reference camera position through the input interface 80.
The event recognition module 82 is configured to perform event recognition on the reference video stream to obtain at least one reference event frame interval, for example, to execute step S20 shown in FIG. 1. Each reference event frame interval corresponds to a unique event and includes frame identifiers of a plurality of consecutive images in which the same event occurs.
The determining module 83 is configured to determine the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers, for example, to execute step S30 shown in FIG. 1. The partial sequence includes the camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and the frame identifier corresponding to that camera position identifier.
The first generating module 84 is configured to generate a guide sequence according to the partial sequences, for example, to execute step S40 shown in FIG. 1.
The second generating module 85 is configured to generate a guide video according to the guide sequence and the video streams of the corresponding camera positions, for example, to execute step S50 shown in FIG. 1.
In some embodiments, the broadcast directing apparatus 8 further includes a buffer 86. The buffer 86 is configured to store the video streams of the corresponding camera positions. For example, the obtaining module 81 may acquire the video streams of the corresponding camera positions through the input interface 80 and buffer them in the buffer 86.
In some embodiments, the broadcast directing apparatus 8 further includes an output interface 87. The second generating module 85 outputs the guide video through the output interface 87 for live broadcasting.
FIG. 9 is a block diagram showing a broadcast directing apparatus according to other embodiments of the present disclosure.
As shown in FIG. 9, the broadcast directing apparatus 9 includes a memory 91 and a processor 92 coupled to the memory 91. The memory 91 is used to store instructions for executing the corresponding embodiments of the broadcast directing method. The processor 92 is configured to execute the broadcast directing method in any of the embodiments of the present disclosure based on the instructions stored in the memory 91.
FIG. 10 is a block diagram showing a broadcast directing system according to some embodiments of the present disclosure.
As shown in FIG. 10, the broadcast directing system 10 includes a broadcast directing apparatus 101 and at least one camera 102. The broadcast directing apparatus 101 is the broadcast directing apparatus in any of the embodiments of the present disclosure, and is configured to execute the broadcast directing method in any of the embodiments of the present disclosure.
The at least one camera 102 is configured to generate a video stream and send the video stream to the broadcast directing apparatus. One camera corresponds to one camera position and has a unique camera position identifier. The video streams include, but are not limited to, the reference video stream and the video streams of the corresponding camera positions.
FIG. 11 is a block diagram showing a computer system for implementing some embodiments of the present disclosure.
As shown in FIG. 11, the computer system 110 may be embodied in the form of a general-purpose computing device. The computer system 110 includes a memory 1110, a processor 1120 and a bus 1100 connecting different system components.
The memory 1110 may include, for example, a system memory, a non-volatile storage medium, and the like. The system memory stores, for example, an operating system, application programs, a boot loader (Boot Loader) and other programs. The system memory may include volatile storage media, such as random access memory (RAM) and/or cache memory. The non-volatile storage medium stores, for example, instructions for executing the corresponding embodiments of at least one of the broadcast directing methods. Non-volatile storage media include, but are not limited to, magnetic disk storage, optical storage, flash memory, and the like.
The processor 1120 can be implemented by means of discrete hardware components such as general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, and discrete gates or transistors. Accordingly, each module, such as the judgment module and the determination module, can be implemented by a central processing unit (CPU) running the instructions, stored in the memory, that execute the corresponding steps, or by a dedicated circuit that executes the corresponding steps.
The bus 1100 can use any of a variety of bus structures. For example, the bus structures include, but are not limited to, an industry standard architecture (ISA) bus, a micro channel architecture (MCA) bus, and a peripheral component interconnect (PCI) bus.
The computer system 110 may also include an input/output interface 1130, a network interface 1140, a storage interface 1150, and so on. These interfaces 1130, 1140, 1150, the memory 1110 and the processor 1120 may be connected through the bus 1100. The input/output interface 1130 can provide a connection interface for input/output devices such as a display, a mouse and a keyboard. The network interface 1140 provides a connection interface for various networked devices. The storage interface 1150 provides a connection interface for external storage devices such as floppy disks, USB flash drives and SD cards.
Here, various aspects of the present disclosure are described with reference to the flowcharts and/or block diagrams of the methods, apparatuses and computer program products according to the embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks, can be implemented by computer-readable program instructions.
These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer or other programmable apparatus to produce a machine, such that the instructions, when executed by the processor, produce an apparatus that implements the functions specified in one or more blocks of the flowcharts and/or block diagrams.
These computer-readable program instructions can also be stored in a computer-readable memory. These instructions cause a computer to work in a specific manner, thereby producing an article of manufacture that includes instructions implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.
The present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
Through the broadcast directing method, apparatus and system and the computer-storable medium of the above embodiments, labor costs are reduced, and the real-time performance and accuracy of directing are improved.
So far, the broadcast directing method, apparatus and system and the computer-storable medium according to the present disclosure have been described in detail. In order to avoid obscuring the concept of the present disclosure, some details known in the art are not described. Based on the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.

Claims (19)

  1. A broadcast directing method, comprising:
    acquiring a reference video stream from a reference camera position;
    performing event recognition on the reference video stream to obtain at least one reference event frame interval, wherein each reference event frame interval corresponds to a unique event, and each reference event frame interval comprises frame identifiers of a plurality of consecutive images in which the same event occurs;
    determining a partial sequence of each reference event frame interval according to a correspondence between events and camera position identifiers, wherein the partial sequence comprises a camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and a frame identifier corresponding to the camera position identifier;
    generating a guide sequence according to the partial sequences; and
    generating a guide video according to the guide sequence and the video streams of the corresponding camera positions.
  2. The broadcast directing method according to claim 1, wherein the at least one reference event frame interval comprises an i-th reference event frame interval, i being a positive integer, and determining the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers comprises:
    determining an initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, wherein a start frame identifier and an end frame identifier of the initial partial sequence are a start frame identifier and an end frame identifier of the i-th reference event frame interval, respectively;
    acquiring a video stream from at least one first auxiliary camera position; and
    extending the initial partial sequence of the i-th reference event frame interval by using the video stream from the at least one first auxiliary camera position, to obtain the partial sequence of the i-th reference event frame interval.
  3. The broadcast directing method according to claim 2, wherein the at least one reference event frame interval further comprises an (i+1)-th reference event frame interval, a start frame identifier and an end frame identifier of the i-th reference event frame interval are s_i and e_i respectively, a start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}, and extending the initial partial sequence of the i-th reference event frame interval comprises:
    for the case where i is equal to 1, in the case where s_i is not adjacent to 1, or e_i is not adjacent to s_{i+1}, or both, acquiring, as an extension video stream, at least one of the video stream between s_i and 1 and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position; and
    extending the initial partial sequence of the i-th reference event frame interval by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
  4. The broadcast directing method according to claim 2, wherein the at least one reference frame interval further comprises an (i-1)-th reference event frame interval, an end frame identifier of the partial sequence of the (i-1)-th reference event frame interval is E_{i-1}, and extending the initial partial sequence of the i-th reference event frame interval comprises:
    for the case where i is greater than 1, in the case where s_i is not adjacent to E_{i-1}, or e_i is not adjacent to s_{i+1}, or both, acquiring, as an extension video stream, at least one of the video stream between s_i and E_{i-1} and the video stream between e_i and s_{i+1} from the at least one first auxiliary camera position; and
    extending the initial partial sequence of the i-th reference event frame interval by using the extension video stream, to obtain the partial sequence of the i-th reference event frame interval.
  5. The broadcast directing method according to claim 3 or 4, wherein there are multiple extension video streams coming from multiple first auxiliary camera positions, and extending the initial partial sequence of the i-th reference event frame interval comprises:
    performing face recognition on each extension video stream to obtain at least one face frame interval corresponding to that extension video stream, wherein each face frame interval corresponds to a unique face recognition result, and each face frame interval comprises frame identifiers of a plurality of consecutive images having the same face recognition result;
    generating at least one extension frame interval according to the face frame intervals of the extension video streams, wherein each extension frame interval comprises at least parts of a plurality of concatenable face frame intervals corresponding to different first auxiliary camera positions;
    obtaining an extension sequence according to the extension frame interval, among the at least one extension frame interval, that corresponds to the largest number of first auxiliary camera positions and has the largest total number of frames, wherein the extension sequence comprises a camera position identifier of each frame of the to-be-played video corresponding to that extension frame interval and a frame identifier corresponding to the camera position identifier; and
    extending the initial partial sequence of the i-th reference event frame interval according to the extension sequence, to obtain the partial sequence of the i-th reference event frame interval.
  6. The broadcast directing method according to claim 4, wherein generating at least one extension frame interval according to the face frame intervals of the extension video streams comprises:
    for the extension video stream of each first auxiliary camera position, determining the face frame interval adjacent to the i-th reference event frame interval as an initial extension frame interval;
    starting from the face frame interval adjacent to the i-th reference event frame interval, in the direction of decreasing frame identifiers or in the direction of increasing frame identifiers, concatenating to the initial extension frame interval at least part of one face frame interval, concatenable with the initial extension frame interval, of a first auxiliary camera position other than the present first auxiliary camera position, so as to update the initial extension frame interval;
    cyclically updating the initial extension frame interval until there is no longer any face frame interval, of a first auxiliary camera position other than the first auxiliary camera position corresponding to the initial extension frame interval, that is concatenable with the initial extension frame interval; and
    determining the updated initial extension frame interval as an extension frame interval.
  7. The broadcast directing method according to claim 1, wherein the at least one reference event frame interval comprises an i-th reference event frame interval and an (i+1)-th reference event frame interval, i being an integer greater than or equal to 1, a start frame identifier and an end frame identifier of the i-th reference event frame interval are s_i and e_i respectively, a start frame identifier of the (i+1)-th reference event frame interval is s_{i+1}, and determining the partial sequence of each reference event frame interval according to the correspondence between events and camera position identifiers comprises:
    determining an initial partial sequence of the i-th reference event frame interval according to the correspondence between events and camera position identifiers, wherein a start frame identifier and an end frame identifier of the initial partial sequence are s_i and e_i, respectively;
    in the case where e_i is not adjacent to s_{i+1}, determining a replay type according to the event corresponding to the i-th reference event frame interval;
    acquiring at least one replay video stream corresponding to the replay type; and
    extending the initial partial sequence according to the at least one replay video stream, to obtain the partial sequence of the i-th reference event frame interval.
  8. The broadcast directing method according to claim 7, wherein extending the initial partial sequence comprises:
    generating at least one replay sequence according to the at least one replay video stream, wherein each replay sequence comprises a camera position identifier of each frame located between e_i and s_{i+1} and a frame identifier corresponding to the camera position identifier; and
    extending the initial partial sequence by using the at least one replay sequence.
  9. The broadcast directing method according to claim 8, wherein the replay type comprises a first replay type, and generating the replay sequence according to the replay video stream comprises:
    in the case where the replay type is the first replay type, performing event recognition on the at least one replay video stream to obtain at least one auxiliary event frame interval, wherein the auxiliary event frame interval comprises frame identifiers of a plurality of consecutive images in which the event corresponding to the i-th reference event frame interval occurs; and
    generating at least one replay sequence according to the at least one auxiliary event frame interval.
  10. The broadcast directing method according to claim 9, wherein generating at least one replay sequence according to the at least one auxiliary event frame interval comprises:
    sorting the at least one auxiliary event frame interval according to a total number of frames and a weight of each auxiliary event frame interval; and
    generating at least one replay sequence according to the sorting result.
  11. The broadcast directing method according to claim 7, wherein the replay type comprises a first replay type, and acquiring at least one replay video stream corresponding to the replay type comprises:
    in the case where the replay type is the first replay type, acquiring the video stream between s_i - m and e_i + n from the at least one first auxiliary camera position as the replay video stream, m and n both being integers greater than or equal to 0.
  12. The broadcast directing method according to claim 7, wherein the replay type comprises a second replay type, and acquiring the replay video stream corresponding to the replay type comprises:
    in the case where the replay type is the second replay type, acquiring, according to the reference video stream, the camera position angle corresponding to each frame between s_i and e_i;
    determining, according to the camera position angles, the region where the event corresponding to the i-th reference event frame interval occurs; and
    acquiring the video stream between s_i and e_i of at least one second auxiliary camera position located in the region as the replay video stream.
  13. The broadcast directing method according to any one of claims 1-4 and 6-12, wherein the at least one reference event frame interval comprises an i-th reference event frame interval and an (i+1)-th reference event frame interval, i being an integer greater than or equal to 1, and generating the guide sequence comprises:
    in the case where the end frame identifier E_i of the partial camera position sequence of the i-th reference event frame interval is not adjacent to the start frame identifier S_{i+1} of the partial camera position sequence of the (i+1)-th reference event frame interval, generating a supplementary sequence, wherein the supplementary sequence comprises the camera position and the frame identifier of each frame located between E_i and S_{i+1}, and the camera position of each frame located between E_i and S_{i+1} is a third auxiliary camera position; and
    merging the partial sequences and the supplementary sequence to obtain the guide sequence.
  14. The broadcast directing method according to claim 13, wherein the reference camera position is used to provide a close-up video stream of the player with the ball, the first auxiliary camera positions are used to provide close-up video streams from different angles on the field, the second auxiliary camera positions are used to provide standard video streams from different angles on the field, and the third auxiliary camera position is used to provide a standard video stream from the audience's viewing angle.
  15. The broadcast directing method according to claim 1, wherein generating the guide video comprises:
    acquiring the frames corresponding to the guide sequence according to the guide sequence and the video streams of the corresponding camera positions; and
    encoding the frames to obtain the guide video.
  16. A broadcast directing apparatus, comprising:
    an obtaining module configured to acquire a reference video stream from a reference camera position;
    an event recognition module configured to perform event recognition on the reference video stream to obtain at least one reference event frame interval, wherein each reference event frame interval corresponds to a unique event, and each reference event frame interval comprises frame identifiers of a plurality of consecutive images in which the same event occurs;
    a determining module configured to determine a partial sequence of each reference event frame interval according to a correspondence between events and camera position identifiers, wherein the partial sequence comprises a camera position identifier of each frame of the to-be-played video corresponding to the reference event frame interval and a frame identifier corresponding to the camera position identifier;
    a first generating module configured to generate a guide sequence according to the partial sequences; and
    a second generating module configured to generate a guide video according to the guide sequence and the video streams of the corresponding camera positions.
  17. A broadcast directing apparatus, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to execute the broadcast directing method according to any one of claims 1 to 15 based on instructions stored in the memory.
  18. A broadcast directing system, comprising:
    the broadcast directing apparatus according to any one of claims 16-17; and
    at least one camera configured to generate a video stream and send the video stream to the broadcast directing apparatus.
  19. A computer-storable medium having computer program instructions stored thereon, wherein the instructions, when executed by a processor, implement the broadcast directing method according to any one of claims 1 to 15.
PCT/CN2021/093223 2020-05-29 2021-05-12 导播方法、装置及系统 WO2021238653A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/999,984 US20230209141A1 (en) 2020-05-29 2021-05-12 Broadcast directing method, apparatus and system
JP2022573344A JP2023527218A (ja) 2020-05-29 2021-05-12 放送演出方法、装置、およびシステム
EP21812681.1A EP4145834A4 (en) 2020-05-29 2021-05-12 BROADCAST ALIGNMENT METHOD, APPARATUS AND SYSTEM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010477406.4 2020-05-29
CN202010477406.4A CN111787341B (zh) 2020-05-29 2020-05-29 导播方法、装置及系统

Publications (1)

Publication Number Publication Date
WO2021238653A1 true WO2021238653A1 (zh) 2021-12-02

Family

ID=72754458

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/093223 WO2021238653A1 (zh) 2020-05-29 2021-05-12 导播方法、装置及系统

Country Status (5)

Country Link
US (1) US20230209141A1 (zh)
EP (1) EP4145834A4 (zh)
JP (1) JP2023527218A (zh)
CN (1) CN111787341B (zh)
WO (1) WO2021238653A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111787341B (zh) * 2020-05-29 2023-12-05 北京京东尚科信息技术有限公司 导播方法、装置及系统

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004014061A2 (en) * 2002-08-02 2004-02-12 University Of Rochester Automatic soccer video analysis and summarization
US20080138029A1 (en) * 2004-07-23 2008-06-12 Changsheng Xu System and Method For Replay Generation For Broadcast Video
CN108540817A (zh) * 2018-05-08 2018-09-14 成都市喜爱科技有限公司 视频数据处理方法、装置、服务器及计算机可读存储介质
CN109714644A (zh) * 2019-01-22 2019-05-03 广州虎牙信息科技有限公司 一种视频数据的处理方法、装置、计算机设备和存储介质
CN110049345A (zh) * 2019-03-11 2019-07-23 北京河马能量体育科技有限公司 一种多视频流导播方法及导播处理系统
CN110996138A (zh) * 2019-12-17 2020-04-10 腾讯科技(深圳)有限公司 一种视频标注方法、设备及存储介质
CN111787341A (zh) * 2020-05-29 2020-10-16 北京京东尚科信息技术有限公司 导播方法、装置及系统

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8274564B2 (en) * 2006-10-13 2012-09-25 Fuji Xerox Co., Ltd. Interface for browsing and viewing video from multiple cameras simultaneously that conveys spatial and temporal proximity
US9235765B2 (en) * 2010-08-26 2016-01-12 Blast Motion Inc. Video and motion event integration system
US8799300B2 (en) * 2011-02-10 2014-08-05 Microsoft Corporation Bookmarking segments of content
US9372531B2 (en) * 2013-03-12 2016-06-21 Gracenote, Inc. Detecting an event within interactive media including spatialized multi-channel audio content
US9241103B2 (en) * 2013-03-15 2016-01-19 Voke Inc. Apparatus and method for playback of multiple panoramic videos with control codes
US11165994B2 (en) * 2013-05-13 2021-11-02 Texas Instruments Incorporated Analytics-driven summary views for surveillance networks
WO2014192487A1 (ja) * 2013-05-29 2014-12-04 日本電気株式会社 多眼撮像システム、取得画像の合成処理方法、及びプログラム
CN105659170B (zh) * 2013-06-27 2019-02-12 Abb瑞士股份有限公司 用于向远程用户传送视频的方法及视频通信装置
US10075680B2 (en) * 2013-06-27 2018-09-11 Stmicroelectronics S.R.L. Video-surveillance method, corresponding system, and computer program product
US20150116501A1 (en) * 2013-10-30 2015-04-30 Sony Network Entertainment International Llc System and method for tracking objects
KR102105189B1 (ko) * 2013-10-31 2020-05-29 한국전자통신연구원 관심 객체 추적을 위한 다중 카메라 동적 선택 장치 및 방법
US20150128174A1 (en) * 2013-11-04 2015-05-07 Broadcom Corporation Selecting audio-video (av) streams associated with an event
US9600723B1 (en) * 2014-07-03 2017-03-21 Google Inc. Systems and methods for attention localization using a first-person point-of-view device
CN104581380B (zh) * 2014-12-30 2018-08-31 联想(北京)有限公司 一种信息处理的方法及移动终端
US10129582B2 (en) * 2015-06-30 2018-11-13 Kempt, LLC Systems, methods, and computer program products for capturing spectator content displayed at live events
US9916867B2 (en) * 2016-01-06 2018-03-13 Hulu, LLC Video capture and testing of media player playback of a video
EP3526794A1 (en) * 2016-10-14 2019-08-21 Rovi Guides, Inc. Systems and methods for providing a slow motion video stream concurrently with a normal-speed video stream upon detection of an event
CN106412677B (zh) * 2016-10-28 2020-06-02 北京奇虎科技有限公司 一种回放视频文件的生成方法和装置
KR102117686B1 (ko) * 2016-11-01 2020-06-01 주식회사 케이티 영상 제공 서버, 영상 제공 방법 및 사용자 단말
CN108632661A (zh) * 2017-03-17 2018-10-09 北京京东尚科信息技术有限公司 播放方法和播放装置
CN107147920B (zh) * 2017-06-08 2019-04-12 简极科技有限公司 一种多源视频剪辑播放方法及系统
CN109326310B (zh) * 2017-07-31 2022-04-08 西梅科技(北京)有限公司 一种自动剪辑的方法、装置及电子设备
US10951879B2 (en) * 2017-12-04 2021-03-16 Canon Kabushiki Kaisha Method, system and apparatus for capture of image data for free viewpoint video
CN109194978A (zh) * 2018-10-15 2019-01-11 广州虎牙信息科技有限公司 直播视频剪辑方法、装置和电子设备
CN110381366B (zh) * 2019-07-09 2021-12-17 新华智云科技有限公司 赛事自动化报道方法、系统、服务器及存储介质
CN110798692A (zh) * 2019-09-27 2020-02-14 咪咕视讯科技有限公司 一种视频直播方法、服务器及存储介质
CN110933460B (zh) * 2019-12-05 2021-09-07 腾讯科技(深圳)有限公司 视频的拼接方法及装置、计算机存储介质
CN110944123A (zh) * 2019-12-09 2020-03-31 北京理工大学 一种体育赛事智能导播方法

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004014061A2 (en) * 2002-08-02 2004-02-12 University Of Rochester Automatic soccer video analysis and summarization
US20080138029A1 (en) * 2004-07-23 2008-06-12 Changsheng Xu System and Method For Replay Generation For Broadcast Video
CN108540817A (zh) * 2018-05-08 2018-09-14 成都市喜爱科技有限公司 视频数据处理方法、装置、服务器及计算机可读存储介质
CN109714644A (zh) * 2019-01-22 2019-05-03 广州虎牙信息科技有限公司 一种视频数据的处理方法、装置、计算机设备和存储介质
CN110049345A (zh) * 2019-03-11 2019-07-23 北京河马能量体育科技有限公司 一种多视频流导播方法及导播处理系统
CN110996138A (zh) * 2019-12-17 2020-04-10 腾讯科技(深圳)有限公司 一种视频标注方法、设备及存储介质
CN111787341A (zh) * 2020-05-29 2020-10-16 北京京东尚科信息技术有限公司 导播方法、装置及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4145834A4 *

Also Published As

Publication number Publication date
EP4145834A1 (en) 2023-03-08
CN111787341B (zh) 2023-12-05
JP2023527218A (ja) 2023-06-27
US20230209141A1 (en) 2023-06-29
CN111787341A (zh) 2020-10-16
EP4145834A4 (en) 2024-03-20

Similar Documents

Publication Publication Date Title
WO2017038541A1 (ja) 映像処理装置、映像処理方法、及び、プログラム
US8599317B2 (en) Scene recognition methods for virtual insertions
US9167221B2 (en) Methods and systems for video retargeting using motion saliency
JP7034666B2 (ja) 仮想視点画像の生成装置、生成方法及びプログラム
US10958854B2 (en) Computer-implemented method for generating an output video from multiple video sources
US8873861B2 (en) Video processing apparatus and method
US8773555B2 (en) Video bit stream extension by video information annotation
Ariki et al. Automatic production system of soccer sports video by digital camera work based on situation recognition
Grana et al. Linear transition detection as a unified shot detection approach
US8973029B2 (en) Backpropagating a virtual camera to prevent delayed virtual insertion
US8488887B2 (en) Method of determining an image distribution for a light field data structure
WO2021139728A1 (zh) 全景视频处理方法、装置、设备及存储介质
JP2009505553A (ja) ビデオストリームへの視覚効果の挿入を管理するためのシステムおよび方法
WO2021238653A1 (zh) 导播方法、装置及系统
Halperin et al. Egosampling: Wide view hyperlapse from egocentric videos
WO2021017496A1 (zh) 导播方法、装置及计算机可读存储介质
KR102299565B1 (ko) 실시간 방송 영상에서 실시간으로 인물 객체를 인식하고 영상 처리하는 방법 및 이러한 방법을 수행하는 장치
JP2005223487A (ja) デジタルカメラワーク装置、デジタルカメラワーク方法、及びデジタルカメラワークプログラム
Kawamura et al. Rsviewer: An efficient video viewer for racquet sports focusing on rally scenes.
Duan et al. Flad: a human-centered video content flaw detection system for meeting recordings
Wang et al. Personal multi-view viewpoint recommendation based on trajectory distribution of the viewing target
Zhang et al. Coherent video generation for multiple hand-held cameras with dynamic foreground
Itazuri et al. Court-based volleyball video summarization focusing on rally scene
Wang et al. Pixel-wise video stabilization
Mitra et al. A flexible scheme for state assignment based on characteristics of the FSM

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21812681

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022573344

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2021812681

Country of ref document: EP

Effective date: 20221202

NENP Non-entry into the national phase

Ref country code: DE