WO2018223554A1 - 一种多源视频剪辑播放方法及系统 - Google Patents

A multi-source video clip playing method and system

Info

Publication number
WO2018223554A1
WO2018223554A1 (PCT/CN2017/102172, CN2017102172W)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
screen
processing unit
coordinate system
positioning information
Prior art date
Application number
PCT/CN2017/102172
Other languages
English (en)
French (fr)
Inventor
吴建成
张也雷
韩步勇
罗向望
郭岱硕
Original Assignee
简极科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 简极科技有限公司 filed Critical 简极科技有限公司
Publication of WO2018223554A1 publication Critical patent/WO2018223554A1/zh

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23412 - Processing of video elementary streams for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 - Higher-level, semantic clustering, classification or understanding of video scenes of sport video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 - Server components or server architectures
    • H04N21/218 - Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 - Live feed
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 - Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 - Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 - Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects

Definitions

  • the present invention relates to the field of video data processing technologies, and in particular, to a multi-source video clip playing method and system.
  • Ball games are the most widespread sports in the world and are deeply loved by fans.
  • During real-time broadcast of ball-game video, because the venue is large and many players are in motion, some exciting shots cannot be intelligently captured and switched to by the video capture devices.
  • In current live broadcasts of ball games, switching to exciting events is done manually by a director; this easily causes delays and missed highlights, and cannot accurately switch to the best viewing angle of an exciting event.
  • Some prior-art methods use visual information to detect exciting events and edit or switch the video according to the detection result, but they mainly recognize and judge based on the goal and the shot type, so the exciting events they recognize mostly occur at the goal or near the penalty area; exciting events elsewhere, such as accurate passing combinations or fouls in a scramble that do not occur near the penalty area or the goal, cannot be accurately detected or used for intelligent video clip switching.
  • In view of this, the present invention provides a multi-source video clip playing method and system that can detect, online and in real time, the positioning information sent by the ball and by wearable devices on the players, restore the accurate positions of the ball and the players in the video picture with a precise positioning algorithm, and clip and play the video according to changes in the position information of the ball and the players.
  • the present invention provides a multi-source video clip playing method, comprising the following steps:
  • Step 1: Capture real-time camera footage of the stadium, acquire synchronized video streams of the entire stadium, and save the video streams in real time to the local video storage unit of the video server;
  • Step 2: Obtain the positioning information of the ball and the players in the court position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system;
  • Step 3: Acquire the direction angle of the ball or a player according to the positioning information in the court position coordinate system, generate a screen-switching instruction according to the direction angle, and switch to the partial picture according to the instruction;
  • Step 4: Enlarge the partial picture to generate an enlarged picture;
  • Step 5: Taking the positioning information in the graphic picture coordinate system as the picture center point, crop the enlarged picture according to a first size standard to obtain the cropped image, and output the cropped image to the display terminal.
  • The method further includes step 21: performing machine learning and training for court events on massive historical match data, obtaining an event prediction and judgment model.
  • In step 2, the positioning information in the court position coordinate system is collected by position-collecting terminals disposed on the ball and the players.
  • In step 2, the positioning information in the court position coordinate system is mapped into positioning information in the graphic picture coordinate system by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
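As an illustrative sketch (not the patent's implementation), this court-to-picture mapping can be exercised in homogeneous coordinates. The composition order M_ext = R_z · R_x · T and all numeric values below are assumptions chosen for the example.

```python
import numpy as np

def rot_x(a):
    """Homogeneous 4x4 rotation about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, c, -s, 0.0],
                     [0.0, s, c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def rot_z(a):
    """Homogeneous 4x4 rotation about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0, 0.0],
                     [s, c, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def translate(tx, ty, tz):
    """Homogeneous 4x4 translation (the matrix T)."""
    m = np.eye(4)
    m[:3, 3] = [tx, ty, tz]
    return m

def court_to_picture(x, y, z, m_in, m_ext):
    """v_output = M_in @ M_ext @ v_input; returns picture coordinates (u, v)."""
    v_in = np.array([x, y, z, 1.0])
    v_out = m_in @ m_ext @ v_in
    return v_out[0] / v_out[3], v_out[1] / v_out[3]

# Illustrative (assumed) parameters: no rotation or translation, and a
# scale of 10 uov pixels per xoy metre.
M_ext = rot_z(0.0) @ rot_x(0.0) @ translate(0.0, 0.0, 0.0)
M_in = np.diag([10.0, 10.0, 1.0, 1.0])

u, v = court_to_picture(5.0, 3.0, 0.0, M_in, M_ext)
print(u, v)  # 50.0 30.0
```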
  • In step 3, according to v_input, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction, thereby switching the partial picture.
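A minimal sketch of this azimuth-based switching, assuming a ring of eight evenly spaced cameras (the camera count and layout are illustrative, not from the patent):

```python
import math

# Assumed setup for illustration: eight cameras evenly spaced around the
# court, each covering a 45-degree azimuth sector; the 0-degree reference
# line is the positive x axis through the midpoint of the court's long side.
NUM_CAMERAS = 8

def azimuth_deg(x, y, cx=0.0, cy=0.0):
    """Azimuth of (x, y) about the reference point (cx, cy), in [0, 360)."""
    return math.degrees(math.atan2(y - cy, x - cx)) % 360.0

def select_camera(x, y):
    """Pick the camera whose sector contains the azimuth of the ball/player,
    i.e. the target of the screen-switching instruction."""
    sector = 360.0 / NUM_CAMERAS
    return int(azimuth_deg(x, y) // sector)

print(select_camera(5.0, 10.0))  # azimuth ~63.4 degrees -> camera 1
```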
  • In step 4, the partial picture is enlarged according to a preset picture definition to generate the enlarged picture.
  • In step 5, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
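The crop of step 5 reduces to a clamped window computation around the mapped point; the frame size and crop size below stand in for the "first size standard" and are illustrative assumptions.

```python
def crop_around(u, v, frame_w, frame_h, crop_w, crop_h):
    """Return the (left, top, right, bottom) box of a crop_w x crop_h crop
    centred on the mapped point (u, v), clamped to the frame boundaries."""
    left = min(max(u - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(v - crop_h // 2, 0), frame_h - crop_h)
    return left, top, left + crop_w, top + crop_h

# Assumed sizes: a 3840x2160 enlarged picture and a 1280x720 crop standing
# in for the "first size standard".
print(crop_around(200, 100, 3840, 2160, 1280, 720))    # (0, 0, 1280, 720)
print(crop_around(1920, 1080, 3840, 2160, 1280, 720))  # (1280, 720, 2560, 1440)
```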
  • The invention also provides a multi-source video clip playing system, comprising: a camera picture capture module, a ball-and-player positioning information acquisition module, a video server, and a live video output interface. A video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit are deployed in the video server. The camera picture capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit and to the live video output interface.
  • The camera picture capture module captures real-time footage of the stadium, acquires synchronized video streams of the entire stadium, and saves them in real time to the local video storage unit of the video server.
  • The picture switching processing unit obtains the positioning information of the ball and the players in the court position coordinate system, transmits it to the video server, and maps it into positioning information in the graphic picture coordinate system; it then acquires the direction angle of the ball or a player from the positioning information in the court position coordinate system, generates a screen-switching instruction according to the direction angle, and switches to the partial picture according to the instruction.
  • The picture enlargement processing unit enlarges the partial picture to generate the enlarged picture.
  • The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain the cropped image, and outputs the cropped image to the display terminal.
  • The system further includes a court event intelligent processing unit. The input of the court event intelligent processing unit is connected to the ball-and-player positioning information acquisition module, and its output is connected to the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit respectively; the court event intelligent processing unit is further connected to an event log.
  • The court event intelligent processing unit performs machine learning and training for court events on massive historical match data and obtains an event prediction and judgment model.
  • The positioning information in the court position coordinate system is collected by position-collecting terminals disposed on the ball and the players.
  • The mapping method is given by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
  • According to v_input, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction, thereby switching the partial picture.
  • the partial picture is enlarged to generate an enlarged picture.
  • The enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
  • The above technical solution provides a multi-source video clip playing method and system: after real-time multi-source camera picture capture, the acquired synchronized video streams are stored locally; the positions of the ball and the players are matched through the position mapping relationship between the two coordinate systems; machine learning and training for court events on massive historical match data yield a local tactical judgment model, according to which the ball or player in an exciting event is anchored; the camera is then switched according to the direction angle in the court position coordinate system, the image is cropped and output according to the positioning information in the graphic picture coordinate system, and the edited highlight is finally output.
  • FIG. 1 is a schematic structural view of Embodiment 1 of the present invention.
  • FIG. 2 is another schematic structural view of Embodiment 1 of the present invention.
  • FIG. 3 is a schematic structural view of Embodiment 2 of the present invention.
  • FIG. 4 is another schematic structural view of Embodiment 2 of the present invention.
  • a multi-source video clip playing method includes the following steps:
  • Step 1: Capture real-time camera footage of the stadium, acquire synchronized video streams of the entire stadium, and save the video streams in real time to the local video storage unit of the video server;
  • Step 2: Obtain the positioning information of the ball and the players in the court position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system; the positioning information in the court position coordinate system is collected by position-collecting terminals disposed on the ball and the players.
  • The positioning information in the court position coordinate system is mapped into positioning information in the graphic picture coordinate system by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
  • Step 3: Acquire the direction angle of the ball or a player according to the positioning information in the court position coordinate system, generate a screen-switching instruction according to the direction angle, and switch to the partial picture according to the instruction. Specifically, according to v_input, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction, thereby switching the partial picture.
  • Step 4: The partial picture is enlarged according to a preset picture definition to generate the enlarged picture.
  • Step 5: Taking the positioning information in the graphic picture coordinate system as the picture center point, crop the enlarged picture according to the first size standard to obtain the cropped image, and output it to the display terminal. Specifically, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
  • It should be noted that the picture switched to in the screen-switching process may be a partial picture from a single camera, or a partial picture of the overall video obtained by stitching multiple video streams together.
  • Embodiment 1 further provides a multi-source video clip playing system, including: a camera picture capture module, a ball-and-player positioning information acquisition module, a video server, and a live video output interface. A video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit are deployed in the video server. The camera picture capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit and to the live video output interface.
  • The camera picture capture module captures real-time footage of the stadium, acquires synchronized video streams of the entire stadium, and saves them in real time to the local video storage unit of the video server.
  • The picture switching processing unit obtains the positioning information of the ball and the players in the court position coordinate system, transmits it to the video server, and maps it into positioning information in the graphic picture coordinate system; it then acquires the direction angle of the ball or a player from the positioning information in the court position coordinate system, generates a screen-switching instruction according to the direction angle, and switches to the partial picture according to the instruction; the positioning information in the court position coordinate system is collected by position-collecting terminals disposed on the ball and the players.
  • The positioning information in the court position coordinate system is mapped into positioning information in the graphic picture coordinate system by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
  • The picture enlargement processing unit enlarges the partial picture according to a preset picture definition to generate the enlarged picture.
  • The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain the cropped image, and outputs the cropped image to the display terminal.
  • Specifically, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction.
  • The enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
  • For example, player A starts dribbling from the midline of the court toward the middle of the goal; the positioning system locates player A at position (x1, y1, z1), and according to formula (1),
  • It should be noted that the picture switched to in the screen-switching process may be a partial picture from a single camera, or a partial picture obtained by stitching multiple video streams together.
  • the method for playing a multi-source video clip includes the following steps:
  • Step 1: Capture real-time camera footage of the stadium, acquire synchronized video streams of the entire stadium, and save the video streams in real time to the local video storage unit of the video server;
  • Step 2: Obtain the positioning information of the ball and the players in the court position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system; the positioning information in the court position coordinate system is collected by position-collecting terminals disposed on the ball and the players.
  • The positioning information in the court position coordinate system is mapped into positioning information in the graphic picture coordinate system by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
  • In step 21, machine learning and training for court events is performed on massive historical match data, obtaining the local tactical judgment model.
  • Step 3: Acquire the direction angle of the ball or a player according to the positioning information in the court position coordinate system, generate a screen-switching instruction according to the direction angle, and switch to the partial picture according to the instruction.
  • Specifically, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction, thereby switching the partial picture.
  • Step 4: Enlarge the partial picture according to a preset picture definition to generate the enlarged picture.
  • Step 5: Taking the positioning information in the graphic picture coordinate system as the picture center point, crop the enlarged picture according to the first size standard to obtain the cropped image, and output it to the display terminal. Specifically, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
  • Embodiment 2 further provides a multi-source video clip playing system, comprising: a camera picture capture module, a ball-and-player positioning information acquisition module, a video server, and a live video output interface. A video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit are deployed in the video server. The camera picture capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit and to the live video output interface.
  • The camera picture capture module captures real-time footage of the stadium, acquires synchronized video streams of the entire stadium, and saves them in real time to the local video storage unit of the video server.
  • The picture switching processing unit obtains the positioning information of the ball and the players in the court position coordinate system, transmits it to the video server, and maps it into positioning information in the graphic picture coordinate system.
  • The positioning information in the court position coordinate system is mapped into positioning information in the graphic picture coordinate system by formula (1): v_output = M_in · M_ext · v_input, where:
  • v_input = [x, y, z, 1]^T is the position of the football/player in the court coordinate system xoy, z being the dimension parallel to the optical axis;
  • v_output is the mapped position in the picture coordinate system uov, with picture coordinates (u, v);
  • M_in is the scaling matrix from xoy scale units to uov scale units;
  • M_ext is the rotation-and-translation matrix of the position coordinates, composed of R and T;
  • R is the rotation matrix of the position coordinates, R = R_z · R_x, with R_z and R_x the rotation matrices about the z and x axes respectively;
  • T is the translation matrix of the position coordinates.
  • This embodiment further includes a cloud mass-data server connected to the court event intelligent processing unit to store massive court event data.
  • The input of the court event intelligent processing unit is connected to the ball-and-player positioning information acquisition module, and its output is connected to the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit respectively.
  • The court event intelligent processing unit is further connected to an event log; it performs machine learning and training for court events on massive historical match data to obtain the local tactical judgment model.
  • The court event intelligent processing unit uses machine learning algorithms combined with high-performance parallel computing to realize intelligent prediction, judgment, efficient and timely processing, and live video broadcast of events on the court (dribbling, passing, stealing, intercepting, and shooting).
  • the specific steps are as follows:
  • The picture switching processing unit acquires the direction angle of the ball or a player according to the positioning information in the court position coordinate system, generates a screen-switching instruction according to the direction angle, and switches to the partial picture according to the instruction; the positioning information in the court position coordinate system is collected by position-collecting terminals set on the ball and the players. Specifically, according to v_input, the azimuth of the ball or the player is calculated with the midpoint of the long side of the court as the center and the positive x-axis direction as the 0-degree reference line; the screen-switching instruction is then issued, according to the azimuth, to the camera corresponding to that azimuth direction, thereby switching the partial picture.
  • The picture enlargement processing unit enlarges the partial picture according to a preset picture definition to generate the enlarged picture.
  • The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain the cropped image, and outputs the cropped image to the display terminal.
  • The enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped out as the cropped image, which is output to the display terminal for display.
  • The data collected in a single game form a time sequence, and the intelligent judgment of events is divided into two cases:
  • The first is to take important events computed by the event processing unit as the main editing events; for example, a shot is an important event, and a video of the shot is automatically generated for the time period before and after the event occurs;
  • The second is to take local offensive tactics composed of consecutive events as the main editing events. A local offensive tactic (passing combinations, cross-covering runs, give-and-go coordination, set-piece tactics, etc.) is an offensive tactic completed by two or more players, and is composed of a series of player positions, ball positions, and events.
  • The tactical judgment model uses a recurrent neural network for local offensive tactic detection and judgment.
  • All positions of the players and the ball at time point i, Pos_i = {pos_1,i, pos_2,i, ..., pos_22,i, pos_b,i}, together with the event Event_i computed by the event processing unit (such as dribble drib_i, pass pass_i, shot shot_i, etc.), form one time unit of input to the recurrent neural network.
  • The output O_i of the neural network is the type of offensive tactic (passing combination, cross-covering run, give-and-go coordination, set-piece tactic, etc.) and the detection judgment. If an offensive tactic is detected, a video of the offensive tactic is automatically generated for the period before and after the tactic occurs.
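As a sketch of how one time unit of the recurrent network's input might be assembled from Pos_i and Event_i (the event vocabulary and the flat feature layout here are illustrative assumptions, not the patent's encoding):

```python
# Assumed event vocabulary; the patent names dribble, pass and shot events.
EVENTS = ["none", "dribble", "pass", "shot"]

def time_unit(player_positions, ball_position, event):
    """Flatten Pos_i (22 player (x, y) pairs plus the ball position) and a
    one-hot Event_i into one per-timestep input vector for the network."""
    assert len(player_positions) == 22
    features = [c for (x, y) in player_positions for c in (x, y)]
    features += list(ball_position)
    features += [1.0 if event == e else 0.0 for e in EVENTS]
    return features

players = [(float(i), 0.0) for i in range(22)]
vec = time_unit(players, (52.5, 34.0), "pass")
print(len(vec))  # 22*2 + 2 + 4 = 50
```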
  • For example, the system inputs the player coordinates and events from i_begin to i_end into the tactical judgment model.
  • After detecting a pass-and-cut combination, the tactical judgment model automatically generates a video of the offensive tactic from i_begin to i_end.
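The automatic clip generation around a detected event or tactic reduces to computing a padded time window over the stored synchronized streams; the 5-second pre/post padding values below are illustrative assumptions.

```python
def clip_window(t_begin, t_end, pre=5.0, post=5.0, stream_start=0.0):
    """Return the (start, end) time range to cut from the stored synchronized
    stream: the detected event/tactic span padded on both sides."""
    start = max(stream_start, t_begin - pre)
    end = t_end + post
    return start, end

# First case: a single important event (a shot detected at t = 120 s).
print(clip_window(120.0, 120.0))  # (115.0, 125.0)
# Second case: an offensive tactic spanning i_begin = 300 s to i_end = 312 s.
print(clip_window(300.0, 312.0))  # (295.0, 317.0)
```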
  • The above technical solution provides a multi-source video clip playing method and system: after real-time multi-source camera picture capture, the acquired synchronized video streams are stored locally; the positions of the ball and the players are matched through the position mapping relationship between the two coordinate systems; the event prediction and judgment model is obtained, and the ball or player in an exciting event is anchored according to it; the camera is then switched according to the direction angle in the court position coordinate system, the image is cropped and output according to the positioning information in the graphic picture coordinate system, and the edited highlight is finally output.
  • the computer device includes but is not limited to: a personal computer, a server, a general purpose computer, a special purpose computer, a network device, an embedded device, a programmable device, a smart mobile terminal, a smart home device, a wearable smart device, a vehicle smart device, and the like;
  • the storage medium includes, but is not limited to, a RAM, a ROM, a magnetic disk, a magnetic tape, an optical disk, a flash memory, a USB flash drive, a mobile hard disk, a memory card, a memory stick, a network server storage, a network cloud storage, and the like.


Abstract

The present invention relates to the technical field of video data processing, and in particular to a multi-source video clipping and playback method and system. Real-time multi-source camera footage is captured and the acquired synchronized video streams are stored locally; the positions of the ball and the players are matched through the mapping between their positions in two coordinate systems; machine learning and training of pitch events on massive historical match data yields a local tactical judgment model; the ball or player in a highlight event is anchored according to the event prediction and judgment model; the camera is then switched according to the azimuth angle in the pitch position coordinate system; the image is cropped and output according to the positioning information in the graphic picture coordinate system; and finally the edited highlight footage is output. Through an intelligent pitch-event processing unit, the present invention achieves intelligent prediction and judgment of highlight events on the pitch, together with efficient, timely processing and live video broadcasting.

Description

A multi-source video clipping and playback method and system

Technical Field

The present invention relates to the technical field of video data processing, and in particular to a multi-source video clipping and playback method and system.

Background

Ball games are among the most widespread sports in the world and are deeply loved by fans. During real-time broadcasting of ball-game video, because the field is large and there are many moving players, some highlight shots cannot be intelligently captured and switched to by the video acquisition devices. In current live broadcasts of ball games, switching to highlight events is done manually by a director; this manual approach easily causes delays and missed highlights, and cannot precisely switch to the best viewing angle of a highlight event. Some existing techniques detect highlight events from visual information and edit or switch the video according to the detection results, but they mainly identify and judge based on the goal and the shot type, so the highlights they recognize mostly occur near the goal or the penalty area. Highlight events in other areas, such as precise passing combinations or fouls in challenges that do not occur near the penalty area or the goal, cannot be accurately detected and judged for intelligent video editing and switching.

Summary of the Invention

Therefore, there is a need for a multi-source video clipping and playback method and system that performs online real-time detection of the positioning information sent by wearable devices on the ball and the players, restores the precise positions of the ball and the players in the video picture by means of an accurate positioning algorithm, and edits and plays the video according to changes in the position information of the ball and the players.

To achieve the above object, the present invention provides a multi-source video clipping and playback method, comprising the following steps:

Step 1: capture real-time camera footage of the pitch, obtain synchronized video streams of the entire pitch, and save the video streams locally in real time to the local video storage unit of a video server;

Step 2: obtain the positioning information of the ball and the players in the pitch position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system;

Step 3: obtain the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generate a picture-switching instruction according to the azimuth angle, and switch to a local picture according to the picture-switching instruction;

Step 4: enlarge the local picture to generate an enlarged picture;

Step 5: taking the positioning information in the graphic picture coordinate system as the picture center, crop the enlarged picture according to a first size standard to obtain a cropped image, and output the cropped image to a display terminal.
Further, the method further comprises step 21: performing machine learning and training of pitch events on massive historical match data to obtain an event prediction and judgment model.

Further, in step 2, the positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players.

Still further, in step 2, the positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows. Define the pitch position coordinate system as the xoy coordinate system and the graphic picture coordinate system as the uov coordinate system:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input          (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

Still further, in step 3, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching to the local picture.

Still further, in step 4, the local picture is enlarged according to a preset picture definition to generate the enlarged picture.

Still further, in step 5, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.
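As a concrete illustration of formula (1), the homogeneous-coordinate mapping from the pitch coordinate system xoy to the picture coordinate system uov can be sketched in Python with NumPy. The rotation angles, translation, and scale factors below are placeholder values for illustration; the patent does not specify them:

```python
import numpy as np

def rotation_x(a):
    """Rotation of position coordinates about the x axis (R_x)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0, 0],
                     [0, c, -s, 0],
                     [0, s, c, 0],
                     [0, 0, 0, 1.0]])

def rotation_z(a):
    """Rotation of position coordinates about the z axis (R_z)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0, 0],
                     [s, c, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1.0]])

def translation(tx, ty, tz):
    """Translation matrix T of the position coordinates."""
    m = np.eye(4)
    m[:3, 3] = [tx, ty, tz]
    return m

def scaling(su, sv):
    """M_in: scaling from xoy scale units to uov scale units; maps a
    homogeneous 4-vector to a homogeneous picture 3-vector [u, v, 1]^T."""
    return np.array([[su, 0, 0, 0],
                     [0, sv, 0, 0],
                     [0, 0, 0, 1.0]])

def map_to_picture(v_input, angle_x, angle_z, t, scale):
    """Formula (1): v_output = M_in (R_x R_z T) v_input."""
    m_ext = rotation_x(angle_x) @ rotation_z(angle_z) @ translation(*t)
    v = scaling(*scale) @ m_ext @ v_input
    return v / v[-1]                      # normalize homogeneous coordinate

# Ball at pitch position (x, y, z) = (10, 5, 0) in homogeneous form,
# with identity rotation/translation and a 20x scale per axis:
v_in = np.array([10.0, 5.0, 0.0, 1.0])
u, v, _ = map_to_picture(v_in, 0.0, 0.0, (0, 0, 0), (20, 20))
```

With the identity extrinsics used here, the mapping reduces to a pure scale, so (u, v) = (200, 100); in practice the camera's rotation and translation would be obtained from calibration.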
The present invention further provides a multi-source video clipping and playback system, comprising: a camera-footage capture module, a ball-and-player positioning information acquisition module, a video server, and a live-video output interface. Deployed within the video server are a video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit. The camera-footage capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit; and the picture tracking processing unit is connected to the live-video output interface.

The camera-footage capture module captures real-time camera footage of the pitch, obtains synchronized video streams of the entire pitch, and saves the video streams locally in real time to the local video storage unit of the video server.

The picture switching processing unit obtains the positioning information of the ball and the players in the pitch position coordinate system, transmits the positioning information to the video server where it is mapped into positioning information in the graphic picture coordinate system, obtains the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generates a picture-switching instruction according to the azimuth angle, and switches to a local picture according to the picture-switching instruction.

The picture enlargement processing unit enlarges the local picture to generate an enlarged picture.

The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain a cropped image, and outputs the cropped image to the display terminal.

Further, the system further comprises an intelligent pitch-event processing unit. The input of the intelligent pitch-event processing unit is connected to the ball-and-player positioning information acquisition module, and its outputs are connected to the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit respectively. The intelligent pitch-event processing unit is also connected to an event log. The intelligent pitch-event processing unit performs machine learning and training of pitch events on massive historical match data to obtain an event prediction and judgment model.

Further, the positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players.

Still further, the positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows.

Define the pitch position coordinate system as the xoy coordinate system and the graphic picture coordinate system as the uov coordinate system:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input           (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

Still further, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching to the local picture.
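The azimuth-based camera selection described above can be sketched as follows. The camera list, identifiers, and facing angles are illustrative assumptions, since the patent does not fix the number or placement of cameras:

```python
import math

def azimuth_deg(x, y, origin=(0.0, 0.0)):
    """Azimuth of the ball/player in degrees, with the midpoint of the
    pitch's long side as the origin and the positive x axis as the
    0-degree reference line."""
    ang = math.degrees(math.atan2(y - origin[1], x - origin[0]))
    return ang % 360.0

def select_camera(angle, cameras):
    """Issue the picture-switching instruction to the camera whose facing
    azimuth is closest to the target angle (hypothetical camera model)."""
    def angular_diff(facing):
        d = abs(facing - angle) % 360.0
        return min(d, 360.0 - d)        # wrap-around distance on a circle
    return min(cameras, key=lambda cam: angular_diff(cam["facing_deg"]))

cameras = [{"id": "cam_mid", "facing_deg": 90.0},
           {"id": "cam_left_goal", "facing_deg": 180.0},
           {"id": "cam_right_goal", "facing_deg": 0.0}]

# A ball at (30, 2) relative to the origin lies almost on the 0-degree
# reference line, so the camera facing 0 degrees is chosen.
cam = select_camera(azimuth_deg(30.0, 2.0), cameras)
```

The wrap-around distance matters: a camera facing 350° is only 10° away from a target at 0°, not 350°, which a naive absolute difference would get wrong.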
Still further, the local picture is enlarged according to a preset picture definition to generate the enlarged picture.

Still further, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.

Distinct from the prior art, the above technical solution designs a multi-source video clipping and playback method and system: real-time multi-source camera footage is captured and the acquired synchronized video streams are stored locally; the positions of the ball and the players are matched through the mapping between their positions in the two coordinate systems; machine learning and training of pitch events on massive historical match data yields a local tactical judgment model; the ball or player in a highlight event is anchored according to the local tactical judgment model and the pitch events; the camera is then switched according to the azimuth angle in the pitch position coordinate system; the image is cropped and output according to the positioning information in the graphic picture coordinate system; and finally the edited highlight footage is output.

The present invention is thereby able to achieve the above.
Brief Description of the Drawings

Fig. 1 is a schematic structural diagram of Embodiment 1 of the present invention.

Fig. 2 is another schematic structural diagram of Embodiment 1 of the present invention.

Fig. 3 is a schematic structural diagram of Embodiment 2 of the present invention.

Fig. 4 is another schematic structural diagram of Embodiment 2 of the present invention.

Detailed Description of the Embodiments

To explain in detail the technical content, structural features, objects achieved, and effects of the technical solution, a detailed description is given below with reference to specific embodiments and the accompanying drawings.

Embodiment 1:
Referring to Figs. 1 and 2, to achieve the above object, this embodiment provides a multi-source video clipping and playback method comprising the following steps:

Step 1: capture real-time camera footage of the pitch, obtain synchronized video streams of the entire pitch, and save the video streams locally in real time to the local video storage unit of a video server.

Step 2: obtain the positioning information of the ball and the players in the pitch position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system. The positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players.

The positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input            (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

Step 3: obtain the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generate a picture-switching instruction according to the azimuth angle, and switch to a local picture according to the instruction. Specifically, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching to the local picture.

Step 4: enlarge the local picture according to a preset picture definition to generate an enlarged picture.

Step 5: taking the positioning information in the graphic picture coordinate system as the picture center, crop the enlarged picture according to the first size standard to obtain a cropped image, and output the cropped image to the display terminal. Specifically, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.

In the method of this embodiment, during picture switching, the switched picture may be a local picture from a single camera, or a local picture of an overall video obtained by stitching multiple video streams together.
Referring to Figs. 1 and 2, Embodiment 1 further provides a multi-source video clipping and playback system, comprising: a camera-footage capture module, a ball-and-player positioning information acquisition module, a video server, and a live-video output interface. Deployed within the video server are a video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit. The camera-footage capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit; and the picture tracking processing unit is connected to the live-video output interface.

The camera-footage capture module captures real-time camera footage of the pitch, obtains synchronized video streams of the entire pitch, and saves the video streams locally in real time to the local video storage unit of the video server.

The picture switching processing unit obtains the positioning information of the ball and the players in the pitch position coordinate system, transmits the positioning information to the video server where it is mapped into positioning information in the graphic picture coordinate system, obtains the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generates a picture-switching instruction according to the azimuth angle, and switches to a local picture according to the instruction. The positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players.

The positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows.

Define the pitch position coordinate system as the xoy coordinate system and the graphic picture coordinate system as the uov coordinate system:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input            (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

The picture enlargement processing unit enlarges the local picture according to a preset picture definition to generate an enlarged picture.

The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain a cropped image, and outputs the cropped image to the display terminal. The azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued to the camera facing the corresponding azimuth, thereby switching to the local picture. The enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.

Example: player A dribbles from the halfway line toward the goal through the middle. The positioning system locates player A at position (x1, y1, z1). According to formula (1),

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input

the picture coordinates (u1, v1) in the camera facing the halfway line are calculated; picture capture then crops the enlarged picture at the first size standard centered on (u1, v1), obtains the cropped image, and outputs it to the display terminal. When player A dribbles close to the goal at position (x2, y2, z2), the picture coordinates (u2, v2) of the camera facing the goal are calculated according to formula (1), and the system switches to cropping the enlarged picture at the first size standard centered on (u2, v2), obtains the cropped image, and outputs it to the display terminal.

In the method of this embodiment, during picture switching, the switched picture may be a local picture from a single camera, or a local picture of an overall video obtained by stitching multiple video streams together.
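The picture-tracking step in the worked example — cropping the enlarged picture at the first size standard, centered on the mapped coordinates (u, v) — can be sketched as below. The frame size and crop size are assumed values; the patent leaves the "first size standard" unspecified:

```python
def crop_centered(frame_w, frame_h, u, v, crop_w, crop_h):
    """Return the (left, top, right, bottom) crop box of the first size
    standard, centered on picture coordinates (u, v) and clamped so the
    box stays entirely inside the enlarged frame."""
    left = min(max(u - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(v - crop_h // 2, 0), frame_h - crop_h)
    return (left, top, left + crop_w, top + crop_h)

# Player A mapped to picture coordinates (u1, v1) = (960, 540) in a
# 1920x1080 enlarged picture; crop a 1280x720 region around that point.
box = crop_centered(1920, 1080, 960, 540, 1280, 720)
```

The clamping keeps the crop box valid when the tracked point approaches the picture edge, which happens whenever a player runs toward the touchline; without it the box would extend past the frame and the output picture would need padding.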
Embodiment 2:

Referring to Figs. 3 and 4, in this Embodiment 2, the multi-source video clipping and playback method comprises the following steps:

Step 1: capture real-time camera footage of the pitch, obtain synchronized video streams of the entire pitch, and save the video streams locally in real time to the local video storage unit of a video server.

Step 2: obtain the positioning information of the ball and the players in the pitch position coordinate system, transmit the positioning information to the video server, and map it into positioning information in the graphic picture coordinate system. The positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players.

The positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input           (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

Step 21: perform machine learning and training of pitch events on massive historical match data to obtain a local tactical judgment model.

Step 3: obtain the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generate a picture-switching instruction according to the azimuth angle, and switch to a local picture according to the instruction. Specifically, in step 3, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching to the local picture.

Step 4: enlarge the local picture according to a preset picture definition to generate an enlarged picture.

Step 5: taking the positioning information in the graphic picture coordinate system as the picture center, crop the enlarged picture according to the first size standard to obtain a cropped image, and output the cropped image to the display terminal. Specifically, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.
Embodiment 2 further provides a multi-source video clipping and playback system, comprising: a camera-footage capture module, a ball-and-player positioning information acquisition module, a video server, and a live-video output interface. Deployed within the video server are a video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit. The camera-footage capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit; and the picture tracking processing unit is connected to the live-video output interface.

The camera-footage capture module captures real-time camera footage of the pitch, obtains synchronized video streams of the entire pitch, and saves the video streams locally in real time to the local video storage unit of the video server.

The picture switching processing unit obtains the positioning information of the ball and the players in the pitch position coordinate system and transmits it to the video server, where it is mapped into positioning information in the graphic picture coordinate system.

The positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows.

Define the pitch position coordinate system as the xoy coordinate system and the graphic picture coordinate system as the uov coordinate system:

v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input          (1)

where:

v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;

v_output = [u, v, 1]^T is the position of the football/player in uov;

M_in is the scaling matrix from xoy scale units to uov scale units;

M_ext is the rotation-and-translation matrix of the position coordinates;

R is the rotation matrix of the position coordinates;

R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;

T is the translation matrix of the position coordinates.

Referring to Fig. 3, this embodiment further comprises a cloud massive-data server, which is connected to the intelligent pitch-event processing unit and stores massive pitch-event data.

The intelligent pitch-event processing unit: its input is connected to the ball-and-player positioning information acquisition module, and its outputs are connected to the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit respectively; it is also connected to an event log. The intelligent pitch-event processing unit performs machine learning and training of pitch events on massive historical match data to obtain a local tactical judgment model. Through the intelligent pitch-event processing unit, intelligent prediction and judgment of events on the pitch (dribbles, passes, tackles, interceptions, and shots), together with efficient and timely processing and live video broadcasting, are achieved. Its core is the use of machine-learning algorithms combined with high-performance parallel computing to achieve real-time, efficient analysis and processing. Referring to Figs. 3 and 4, the specific steps are as follows:

(1) perform machine learning and training of pitch events on massive historical match data to obtain a tactical prediction and judgment model of high accuracy;

(2) on the basis of "picture switching and tracking of the ball/players (objects) on the pitch", picture switching and tracking are no longer driven directly by the ball/player positioning information; instead, through the intelligent pitch-event processing unit as an intermediate processing module, picture switching, zooming, and tracking based on intelligent event prediction and judgment are achieved, giving a better viewing experience.

The picture switching processing unit obtains the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generates a picture-switching instruction according to the azimuth angle, and switches to a local picture according to the instruction; the positioning information in the pitch position coordinate system is collected by position acquisition terminals provided on the ball and the players. Specifically, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line; the picture-switching instruction is then issued to the camera facing the corresponding azimuth, thereby switching to the local picture.

The picture enlargement processing unit enlarges the local picture according to a preset picture definition to generate an enlarged picture.

The picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to the first size standard to obtain a cropped image, and outputs the cropped image to the display terminal. The enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.
The data collected in a single match is ordered as a time sequence, and intelligent judgment of match events for editing is handled in two cases:

In the first case, an important event computed by the event processing unit is taken as the main editing event; for example, a shot is an important event, and a video clip of the shot is automatically generated for the time period before and after the event occurs.

In the second case, a local offensive tactic composed of consecutive events is taken as the main editing event. Local offensive tactics (pass-and-cut combinations, cross-screen combinations, give-and-go (two-on-one) combinations, set-piece tactics, and so on) are tactics in which an offensive combination is completed by two or more players. In other words, an offensive tactic consists of a time-ordered series of player positions, ball positions, and events. The tactical judgment model uses a recurrent neural network for local offensive tactic detection and judgment. All positions of the players and the ball at time point i, Pos_i = {pos_{1,i}, pos_{2,i}, ..., pos_{22,i}, pos_{b,i}}, together with the event Event_i computed by the event processing unit (such as dribble drib_i, pass pass_i, shot shot_i, etc.), form the input of the recurrent neural network for one time unit. The output O_i of the neural network is the type of offensive tactic (pass-and-cut combination, cross-screen combination, give-and-go combination, set-piece tactic, etc.) and the detection decision. If an offensive tactic is detected, a video clip of the offensive tactic is automatically generated for the time period before and after the tactic occurs.

Example: from time point i_begin to i_end, home-team player P1 dribbles on the wing and passes the ball to attacking player P2 cutting through the middle. The system feeds the player coordinates and events from i_begin to i_end into the tactical judgment model; when the model detects the pass-and-cut combination at time point i_end, it automatically generates a video clip of the offensive tactic from i_begin to i_end.
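The recurrent tactic detector described above can be sketched as follows. This is a minimal illustration only: the patent names no framework, layer sizes, or training procedure, so a simple Elman-style recurrence with untrained placeholder weights stands in for the network. Each time step concatenates the 23 (x, y) positions Pos_i (22 players plus the ball) with a one-hot event code Event_i, and the output O_i scores the tactic classes:

```python
import numpy as np

TACTICS = ["pass_and_cut", "cross_screen", "give_and_go", "set_piece", "none"]
EVENTS = ["drib", "pass", "shot", "none"]

rng = np.random.default_rng(0)

# Input per time step i: 23 (x, y) positions plus a one-hot event code.
IN, HID, OUT = 23 * 2 + len(EVENTS), 32, len(TACTICS)
W_xh = rng.normal(0, 0.1, (HID, IN))      # untrained placeholder weights
W_hh = rng.normal(0, 0.1, (HID, HID))
W_hy = rng.normal(0, 0.1, (OUT, HID))

def step(h, pos_i, event_i):
    """One time unit of the recurrent network: positions Pos_i and event
    Event_i in, updated hidden state and tactic scores O_i out."""
    e = np.zeros(len(EVENTS))
    e[EVENTS.index(event_i)] = 1.0
    x = np.concatenate([np.asarray(pos_i, float).ravel(), e])
    h = np.tanh(W_xh @ x + W_hh @ h)      # Elman recurrence
    return h, W_hy @ h                    # O_i: scores over tactic classes

def detect(sequence):
    """Run the sequence from i_begin to i_end and return the tactic class
    with the highest score at the final time point."""
    h = np.zeros(HID)
    for pos_i, event_i in sequence:
        h, o = step(h, pos_i, event_i)
    return TACTICS[int(np.argmax(o))]

# Two time steps on a 50 m-scale pitch: a dribble followed by a pass.
seq = [(rng.uniform(0, 50, (23, 2)), "drib"),
       (rng.uniform(0, 50, (23, 2)), "pass")]
tactic = detect(seq)                      # some class; the net is untrained
```

In a real system the weights would come from the training in step 21 (supervised learning on labeled historical sequences), and a gated cell such as a GRU or LSTM would usually replace the plain tanh recurrence to hold context over the longer sequences a full attack produces.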
Distinct from the prior art, the above technical solution designs a multi-source video clipping and playback method and system: real-time multi-source camera footage is captured and the acquired synchronized video streams are stored locally; the positions of the ball and the players are matched through the mapping between their positions in the two coordinate systems; machine learning and training of pitch events on massive historical match data yields an event prediction and judgment model; the ball or player in a highlight event is anchored according to the event prediction and judgment model; the camera is then switched according to the azimuth angle in the pitch position coordinate system; the image is cropped and output according to the positioning information in the graphic picture coordinate system; and finally the edited highlight footage is output.

It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article, or terminal device that includes the element. In addition, herein, "greater than", "less than", "more than", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it.

Those skilled in the art will appreciate that the above embodiments may be provided as a method, an apparatus, or a computer program product, and may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. All or part of the steps in the methods of the above embodiments may be carried out by a program instructing the relevant hardware, the program being stored in a storage medium readable by a computer device and used for executing all or part of the steps of the methods of the above embodiments. The computer device includes, but is not limited to: a personal computer, a server, a general-purpose computer, a special-purpose computer, a network device, an embedded device, a programmable device, a smart mobile terminal, a smart home device, a wearable smart device, an in-vehicle smart device, and the like. The storage medium includes, but is not limited to: RAM, ROM, magnetic disks, magnetic tapes, optical discs, flash memory, USB flash drives, removable hard disks, memory cards, memory sticks, network server storage, network cloud storage, and the like.

The above embodiments are described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a computer device to produce a machine, so that the instructions executed by the processor of the computer device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although the above embodiments have been described, those skilled in the art, once they learn of the basic inventive concept, can make further changes and modifications to these embodiments. Therefore, the above description covers only embodiments of the present invention and does not thereby limit the scope of patent protection of the present invention. Any equivalent structural or process transformation made using the contents of the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

  1. A multi-source video clipping and playback method, characterized by comprising the following steps:
    Step 1: capturing real-time camera footage of a pitch, obtaining synchronized video streams of the entire pitch, and saving the video streams locally in real time to a local video storage unit of a video server;
    Step 2: obtaining positioning information of the ball and the players in a pitch position coordinate system, transmitting the positioning information to the video server, and mapping it into positioning information in a graphic picture coordinate system;
    Step 3: obtaining the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generating a picture-switching instruction according to the azimuth angle, and switching to a local picture according to the picture-switching instruction;
    Step 4: enlarging the local picture to generate an enlarged picture;
    Step 5: taking the positioning information in the graphic picture coordinate system as the picture center, cropping the enlarged picture according to a first size standard to obtain a cropped image, and outputting the cropped image to a display terminal.
  2. The multi-source video clipping and playback method according to claim 1, characterized by further comprising step 21: performing machine learning and training of pitch events on massive historical match data to obtain a local tactical judgment model.
  3. The multi-source video clipping and playback method according to claim 1, characterized in that, in step 2, the positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows:
    v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_y·R_x·T) · v_input          (1)
    where:
    v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;
    v_output = [u, v, 1]^T is the position of the football/player in uov;
    M_in is the scaling matrix from xoy scale units to uov scale units;
    M_ext is the rotation-and-translation matrix of the position coordinates;
    R is the rotation matrix of the position coordinates;
    R_x and R_y are the rotation matrices of the position coordinates about the x and y axes respectively;
    T is the translation matrix of the position coordinates.
  4. The multi-source video clipping and playback method according to claim 3, characterized in that, in step 3, the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line, and the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching the local picture.
  5. The multi-source video clipping and playback method according to claim 3, characterized in that:
    in step 5, the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.
  6. A multi-source video clipping and playback system, characterized by comprising: a camera-footage capture module, a ball-and-player positioning information acquisition module, a video server, and a live-video output interface, wherein a video storage unit, a picture switching processing unit, a picture enlargement processing unit, and a picture tracking processing unit are deployed within the video server; the camera-footage capture module is connected to the video storage unit; the ball-and-player positioning information acquisition module is connected to the picture switching processing unit and the picture tracking processing unit respectively; the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit are connected in sequence; the picture tracking processing unit is connected to the video storage unit; and the picture tracking processing unit is connected to the live-video output interface;
    the camera-footage capture module captures real-time camera footage of a pitch, obtains synchronized video streams of the entire pitch, and saves the video streams locally in real time to the local video storage unit of the video server;
    the picture switching processing unit obtains positioning information of the ball and the players in a pitch position coordinate system, transmits the positioning information to the video server where it is mapped into positioning information in a graphic picture coordinate system, obtains the azimuth angle of the ball or a player from the positioning information in the pitch position coordinate system, generates a picture-switching instruction according to the azimuth angle, and switches to a local picture according to the picture-switching instruction;
    the picture enlargement processing unit enlarges the local picture to generate an enlarged picture;
    the picture tracking processing unit, taking the positioning information in the graphic picture coordinate system as the picture center, crops the enlarged picture according to a first size standard to obtain a cropped image, and outputs the cropped image to a display terminal.
  7. The multi-source video clipping and playback system according to claim 6, characterized by further comprising an intelligent pitch-event processing unit, wherein the input of the intelligent pitch-event processing unit is connected to the ball-and-player positioning information acquisition module, the outputs of the intelligent pitch-event processing unit are connected to the picture switching processing unit, the picture enlargement processing unit, and the picture tracking processing unit respectively, and the intelligent pitch-event processing unit is further connected to an event log; the intelligent pitch-event processing unit performs machine learning and training of pitch events on massive historical match data to obtain a local tactical judgment model.
  8. The multi-source video clipping and playback system according to claim 7, characterized in that the positioning information in the pitch position coordinate system is mapped into positioning information in the graphic picture coordinate system as follows:
    define the pitch position coordinate system as the xoy coordinate system and the graphic picture coordinate system as the uov coordinate system:
    v_output = M_in · M_ext · v_input = M_in · (R·T) · v_input = M_in · (R_x·R_z·T) · v_input   (1)
    where:
    v_input = [x, y, z, 1]^T is the position of the football/player in xoy, with z the dimension parallel to the optical axis;
    v_output = [u, v, 1]^T is the position of the football/player in uov;
    M_in is the scaling matrix from xoy scale units to uov scale units;
    M_ext is the rotation-and-translation matrix of the position coordinates;
    R is the rotation matrix of the position coordinates;
    R_z and R_x are the rotation matrices of the position coordinates about the z and x axes respectively;
    T is the translation matrix of the position coordinates.
  9. The multi-source video clipping and playback system according to claim 8, characterized in that the azimuth angle of the ball or player is calculated from v_input, taking the midpoint of the long side of the pitch as the center of a circle and the positive x-axis direction as the 0-degree reference line, and the picture-switching instruction is then issued according to this azimuth angle to the camera facing the corresponding azimuth, thereby switching the local picture.
  10. The multi-source video clipping and playback system according to claim 8, characterized in that the enlarged picture is cropped according to v_output: taking (u, v) as the picture center, an image of the first size standard is cropped as the cropped image, and the cropped image is output to the display terminal for picture output.
PCT/CN2017/102172 2017-06-08 2017-09-19 Multi-source video clipping and playback method and system WO2018223554A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710427066.2A CN107147920B (zh) 2017-06-08 2017-06-08 Multi-source video clipping and playback method and system
CN201710427066.2 2017-06-08

Publications (1)

Publication Number Publication Date
WO2018223554A1 true WO2018223554A1 (zh) 2018-12-13

Family

ID=59779575

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/102172 WO2018223554A1 (zh) 2017-06-08 2017-09-19 Multi-source video clipping and playback method and system

Country Status (2)

Country Link
CN (1) CN107147920B (zh)
WO (1) WO2018223554A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111787341A (zh) * 2020-05-29 2020-10-16 北京京东尚科信息技术有限公司 Directing method, apparatus and system
CN113259770A (zh) * 2021-05-11 2021-08-13 北京奇艺世纪科技有限公司 Video playback method and apparatus, electronic device, medium and product
CN113365093A (zh) * 2021-06-07 2021-09-07 广州虎牙科技有限公司 Live-streaming method, apparatus and system, electronic device and storage medium
CN113542894A (zh) * 2020-11-25 2021-10-22 腾讯科技(深圳)有限公司 Game video clipping method, apparatus, device and storage medium

Families Citing this family (7)

Publication number Priority date Publication date Assignee Title
CN107147920B (zh) * 2017-06-08 2019-04-12 简极科技有限公司 Multi-source video clipping and playback method and system
CN109165686B (zh) * 2018-08-27 2021-04-23 成都精位科技有限公司 Method, apparatus and system for constructing player ball-carrying relationships through machine learning
CN111147889B (zh) * 2018-11-06 2022-09-27 阿里巴巴集团控股有限公司 Multimedia resource playback method and apparatus
CN112399096B (zh) * 2019-08-16 2023-06-23 咪咕文化科技有限公司 Video processing method, device and computer-readable storage medium
CN111757147B (zh) * 2020-06-03 2022-06-24 苏宁云计算有限公司 Method, apparatus and system for structuring match video
CN114500773B (zh) * 2021-12-28 2023-10-13 天翼云科技有限公司 Rebroadcast method, system and storage medium
CN116781985B (zh) * 2023-08-23 2023-10-20 北京盘腾科技有限公司 Method and apparatus for controlling live match pictures

Citations (7)

Publication number Priority date Publication date Assignee Title
CN1476725A * 2001-07-25 2004-02-18 Koninklijke Philips Electronics N.V. Method and apparatus for tracking objects in a sports program and selecting an appropriate camera view
CN101324957A * 2008-07-16 2008-12-17 上海大学 Intelligent football video playback method for mobile devices
CN101855674A * 2007-11-07 2010-10-06 汤姆森特许公司 Editing device, editing method, and editing program
CN106488127A * 2016-11-02 2017-03-08 深圳锐取信息技术股份有限公司 Camera switching control method and apparatus based on football detection and tracking
US20170128814A1 (en) * 2015-11-10 2017-05-11 ShotTracker, Inc. Location and event tracking system for games of sport
US20170154222A1 (en) * 2015-11-26 2017-06-01 Robert Zakaluk System and Method for Identifying, Analyzing, and Reporting on Players in a Game from Video
CN107147920A (zh) 2017-06-08 2017-09-08 简极科技有限公司 Multi-source video clipping and playback method and system

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US5513854A (en) * 1993-04-19 1996-05-07 Daver; Gil J. G. System used for real time acquistion of data pertaining to persons in motion
US20150297949A1 (en) * 2007-06-12 2015-10-22 Intheplay, Inc. Automatic sports broadcasting system
CN101753852A * 2008-12-15 2010-06-23 姚劲草 Dynamic miniature map of a sports match based on object detection and tracking
CN102347043B * 2010-07-30 2014-10-22 腾讯科技(北京)有限公司 Multi-angle video playback method and system
AU2015222869B2 * 2014-02-28 2019-07-11 Genius Sports Ss, Llc System and method for performing spatio-temporal analysis of sporting events
CN106606857A * 2016-02-29 2017-05-03 简极科技有限公司 Positioning-based football match technical statistics method


Cited By (7)

Publication number Priority date Publication date Assignee Title
CN111787341A (zh) * 2020-05-29 2020-10-16 北京京东尚科信息技术有限公司 Directing method, apparatus and system
CN111787341B (zh) * 2020-05-29 2023-12-05 北京京东尚科信息技术有限公司 Directing method, apparatus and system
CN113542894A (zh) * 2020-11-25 2021-10-22 腾讯科技(深圳)有限公司 Game video clipping method, apparatus, device and storage medium
CN113259770A (zh) * 2021-05-11 2021-08-13 北京奇艺世纪科技有限公司 Video playback method and apparatus, electronic device, medium and product
CN113259770B (zh) * 2021-05-11 2022-11-18 北京奇艺世纪科技有限公司 Video playback method and apparatus, electronic device, medium and product
CN113365093A (zh) * 2021-06-07 2021-09-07 广州虎牙科技有限公司 Live-streaming method, apparatus and system, electronic device and storage medium
CN113365093B (zh) * 2021-06-07 2022-09-06 广州虎牙科技有限公司 Live-streaming method, apparatus and system, electronic device and storage medium

Also Published As

Publication number Publication date
CN107147920A (zh) 2017-09-08
CN107147920B (zh) 2019-04-12


Legal Events

Date Code Title Description
121 — EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17912554; Country of ref document: EP; Kind code of ref document: A1)
NENP — Non-entry into the national phase (Ref country code: DE)
32PN — EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.06.2020))
122 — EP: PCT application non-entry in European phase (Ref document number: 17912554; Country of ref document: EP; Kind code of ref document: A1)