WO2013026991A1 - Improvements in automatic video production - Google Patents

Improvements in automatic video production Download PDF

Info

Publication number
WO2013026991A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
play
video recording
triangulation
editing system
Prior art date
Application number
PCT/GB2011/001264
Other languages
French (fr)
Inventor
David John THOMAS
Original Assignee
Thomas David John
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomas David John filed Critical Thomas David John
Priority to PCT/GB2011/001264 priority Critical patent/WO2013026991A1/en
Publication of WO2013026991A1 publication Critical patent/WO2013026991A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0007 Image acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A method for automatically producing a video of a subject (30) based on images from at least two mobile video recording devices (20a, 20b) whereby triangulation is used to determine the relative positions of the devices with respect to the subject or scene (30). An automatic editing system may carry out the triangulation or reverse triangulation process and may store boundaries for the devices; data (200) from the devices (20a, 20b) may be used in the triangulation process and may include one or more of video footage; GPS; location; compass bearing; inclination; and zoom settings. The maximum extent of the field of movement (10) of the subjects or objects being filmed may be determined, and the system may determine which camera (20a, 20b) represents the best or closest or most representative view of the current play based on the location of the subject or object (30) in the field of play (10).

Description

IMPROVEMENTS IN AUTOMATIC VIDEO PRODUCTION
Field of the Invention
This invention relates to the automatic production of a video from multiple sources.
Background of the Invention
Many events, such as sporting events, take place without any official video recording being made. In such circumstances, the alternatives to a video recording are a radio broadcast recorded during the event or a concise written record of what transpired produced by sports writers. For people who are unable to be present at the event in person, these alternative records are not particularly satisfying.
Summary of the Invention
According to one aspect of the present invention, there is provided a method for automatically producing a video of a subject or scene based on images from at least two mobile video recording devices whereby triangulation is used to determine the relative positions of the plurality of mobile video recording devices with respect to the subject or scene.
The method provides for the use of at least two, and preferably more, mobile recording devices to record a subject or scene, such as a sports playing field (e.g. football, soccer, athletics or rugby) or a sports course (e.g. golf or rowing), where the relative locations of the mobile recording devices with respect to the subject or scene and to each other are initially unknown and are subsequently determined from the field of view of each device as it tracks the subject/scene. This information then provides a set of boundaries for each device with respect to the subject/scene and the other devices, enabling an automated editing system to determine, using triangulation, which of the mobile recording devices has the best image of the subject/scene at a given time. This then enables a complete recording of the subject/scene in which the best available image is provided automatically.
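The patent does not spell out the triangulation computation, but the geometry is standard. Below is a minimal sketch in Python, assuming flat-ground 2-D coordinates in metres and compass bearings measured clockwise from north; all function and variable names are illustrative rather than taken from the patent.

    import math

    def bearing_to_vector(bearing_deg):
        """Convert a compass bearing (degrees clockwise from north) into a
        unit (east, north) direction vector."""
        rad = math.radians(bearing_deg)
        return (math.sin(rad), math.cos(rad))

    def triangulate(p1, b1, p2, b2):
        """Intersect the sight lines of two cameras.

        p1, p2 -- (east, north) camera positions in metres
        b1, b2 -- compass bearings from each camera to the subject
        Returns the (east, north) subject position, or None when the sight
        lines are (nearly) parallel and no unique intersection exists.
        """
        d1, d2 = bearing_to_vector(b1), bearing_to_vector(b2)
        # Solve p1 + t*d1 = p2 + s*d2 for t via the 2-D cross product.
        denom = d1[0] * d2[1] - d1[1] * d2[0]
        if abs(denom) < 1e-9:
            return None
        dx, dy = p2[0] - p1[0], p2[1] - p1[1]
        t = (dx * d2[1] - dy * d2[0]) / denom
        return (p1[0] + t * d1[0], p1[1] + t * d1[1])

For example, cameras at (0, 0) and (50, 0) reporting bearings of 45° and 315° place the subject at (25, 25). The same line-intersection routine, run backwards from known subject positions, underlies the "reverse triangulation" described later.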
According to another aspect of the present invention, there is provided a system for automatically producing a video of a subject or scene based on images from at least two mobile video recording devices whereby a means is provided to perform triangulation to determine the relative positions of the plurality of mobile video recording devices with respect to the subject or scene.
An automatic editing system may be provided and the automatic editing system carries out the triangulation process.
In one non-limiting example, the data from the at least two mobile video recording devices is used in the triangulation process.
The data may include one or more of video footage; GPS; location; compass bearing; inclination; and zoom settings.
The automated editing system may store the boundaries of each mobile video recording device. In other words, over a period of time the automated editing system may determine the maximum extent of the field of movement of the subjects or objects being filmed.
In one non-limiting example, over a period of time the automated editing system determines the maximum extent of the field of movement of the subjects or objects being filmed. In one embodiment the boundaries of each mobile video recording device are related to the maximum extent of the field of movement of the subjects or objects being filmed, i.e. a playing field with fixed boundaries, or a course whose boundaries may be fixed (e.g. a golf course) but which has more than one field of play, i.e. each hole is considered as a separate field of play. This allows the automated editing system to switch between mobile video recording devices as the area or region of interest changes over time, e.g. as a ball moves from one end of a pitch to the other.
In one non-limiting example, the method uses triangulation to determine the position of mobile video recording devices whose positions are unknown, for example cameras such as the mobile phones of spectators at a football match. The location/behaviour of the subject or scene and the location of each unknown camera relative to the field of play of the subject are determined from this information. The auto editing system then puts together the best, closest or most representative images according to the locations of the unknown cameras relative to the fields of play.
Reverse triangulation may be used to determine the position of each mobile video recording device relative to the field of movement of the subjects or objects being filmed.
In one non-limiting example, the automated editing system determines which camera represents the best or closest or most representative view of the current play based on the location of the subject or object in the field of play. If boundaries for each device have been stored, this information can be used to make this decision. Preferably, the automated editing system determines that the video output from this camera is selected to be displayed or recorded as the output choice for a given moment.
The automated editing system may determine each selected camera at each given instant over the complete range of footage for the entirety of the period of play.
The automated editing system may be able to automatically edit footage for an entire period of play without operator control or intervention.
In one non-limiting example, the automated editing system monitors the location of the subject or object during play and determines whether to maintain or change the selected camera footage as the best, closest or most representative view of the current subject or object being filmed.
The automated editing system may additionally compile an audio commentary taken from a library of commentaries and using the known outcome of an event during a period of play.
According to a second aspect, the invention provides a method of compiling an audio commentary for video footage comprising:
accessing a library of commentaries; and
using the known outcome of an event during a period of play, compiling the audio commentary.
This method of making the commentary is particularly useful when multiple sources of footage might be used, some of which may not have audio recording; and in cases where they do have audio recording, the actual audio recorded may be inappropriate to use.
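The patent leaves the library lookup unspecified. As a sketch under stated assumptions, the library below maps (event type, known outcome) pairs to pre-recorded lines; all keys and phrases are invented for illustration.

    import random

    # Hypothetical commentary library keyed by (event type, outcome). In the
    # scheme described above, each entry would be a recorded phrase whose
    # emotive pitch and intensity already match the known outcome.
    COMMENTARY_LIBRARY = {
        ("shot", "goal"): ["What a strike! It's in the back of the net!"],
        ("shot", "miss"): ["He pulls the trigger... just wide of the post."],
        ("kickoff", None): ["And we're under way here this afternoon."],
    }

    def compile_commentary(events):
        """Assemble a timed commentary track after the event, when every
        outcome is already known.

        events -- iterable of (timestamp_s, event_type, outcome) tuples
        Returns a list of (timestamp_s, line) cues to mix over the video.
        """
        cues = []
        for ts, etype, outcome in events:
            lines = COMMENTARY_LIBRARY.get((etype, outcome))
            if lines:
                cues.append((ts, random.choice(lines)))
        return cues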
According to yet another aspect, there is provided a computer-usable medium for automatically producing a video of a subject or scene based on images from at least two mobile video recording devices, the computer-usable medium embodying computer program code, the computer program code comprising computer executable instructions configured to perform triangulation to determine the relative positions of the plurality of mobile video recording devices with respect to the subject or scene.
The invention will now be described, by way of example only, with reference to the accompanying drawings, in which:-
Brief Description of the Drawings
Figure 1 is a perspective view of the invention at time A;
Figure 2 is a perspective view of the invention at time B;
Figure 3 is a perspective view of the invention showing the edge of playing area;
Figure 4 is a perspective view of the invention showing a second edge of playing area;
Figure 5 is a perspective view of the invention showing a third edge of playing area;
Figure 6 is a perspective view of the invention showing a fourth edge of playing area;
Figure 7 is a perspective view showing the position of a camera relative to subjects at different locations using reverse triangulation;
Figure 8 shows a perspective view of an automated decision choosing the best camera view for a subject;
Figure 9 is a flow diagram detailing steps used in the invention;
Figure 10 is a flow chart according to the invention;
Figure 11 is a schematic diagram of a data processing system in which the present invention may be embodied;
Figure 12 is a schematic diagram of a software system for carrying out the present invention; and
Figure 13 is a schematic diagram of a network of data processing systems in which aspects of the present invention may be implemented.
Detailed Description of the Illustrated Embodiment
The invention provides for the collation of video footage from at least two sources having unknown position/location with respect to a region or area of interest, for example a sports playing field, along with data on GPS, compass bearing, inclination and zoom settings. This all provides data that an automated editing decision making system 250 analyses to provide position information for the devices acquiring footage relative to a field of play of a game or sport, e.g. a sports playing field (Figures 1 & 2).
Referring to Figures 1 and 2, a field of play 10 marks the boundary of the area of interest 12 and ideally there will be sufficient unknown mobile recording devices covering enough of the area of interest 12 to provide coverage of the event. There are two mobile video recording devices (cameras) 20a, 20b which are positioned on one side 14 of the boundary 10 and spaced apart. The specific area or region of interest 30 moves over time and is shown in Figure 1 at time stamp A and in Figure 2 at time stamp B. Lines 22a and 22b show the lines of sight of cameras 20a and 20b respectively at each time stamp. The system analyses information about one or more of GPS, compass bearing, inclination and zoom settings over a period of time in which the subject achieves the maximum extent of their field of play, i.e. the range or viewing field of the source in question (Figures 3, 4, 5 & 6).
Thus, as the region of interest 30 moves around the field of play 10, information or data is collected about the footage from each source 20a, 20b. In some circumstances, for example where the area of interest 30 is remote from the sources 20a, 20b as shown in Figures 3 and 5, both sources 20a, 20b will give a reasonable representation of what is happening. However, if the area of interest is at a boundary 14 close to the sources 20a, 20b but at the distal end of that boundary, the footage taken by a remote source (source 20b in Figure 4 and source 20a in Figure 6) will not have a clear view, and this may be judged by the automated editing device to be outside of that source's field of view.
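One simple way to realise the "maximum extent" observation of Figures 3 to 6 is to accumulate the triangulated subject positions over a period of play and take their bounding box as the field of play. A sketch under that assumption:

    def field_extent(subject_positions):
        """Bounding box of the field of play, estimated from triangulated
        subject positions gathered over a period in which the subject visits
        the extremes of the playing area (Figures 3 to 6).

        subject_positions -- iterable of (east, north) tuples in metres
        Returns ((min_east, min_north), (max_east, max_north)).
        """
        xs = [p[0] for p in subject_positions]
        ys = [p[1] for p in subject_positions]
        return (min(xs), min(ys)), (max(xs), max(ys))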
The system then takes a number of different instances where the subject has moved position and uses triangulation to determine the position of each source 20a, 20b or camera relative to the defined field of play 10 (Figure 7).
So, at the start of play the automated editing device 250 receives data from each source 20a, 20b and over time determines or forms a picture of the boundaries of each device 20a, 20b as the specific area of interest moves over time. For each camera 20a, 20b (20b shown), triangulation is used to determine the position of the camera 20b relative to the specific area of interest or subjects 130, 130' at two different locations at different times.
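Geometrically, this reverse step can reuse the triangulate() helper sketched earlier: the camera lies where the sight lines, traced backwards from two known subject positions, cross. The 180° reversal of each bearing is the only new ingredient; as before, the names are illustrative.

    def locate_camera(s1, b1, s2, b2):
        """Reverse triangulation (Figure 7): given two known subject
        positions s1 and s2, and the camera's compass bearing to the subject
        at each of those instants (b1 and b2), the camera sits at the
        intersection of the two reversed sight lines."""
        return triangulate(s1, (b1 + 180.0) % 360.0,
                           s2, (b2 + 180.0) % 360.0)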
The automated editing decision making system 250 determines which camera provides the best representation of the play at that moment or time frame. The automated editing decision making system 250 switches the output to that of the particular video source at that particular moment, i.e. the camera that provides the best representation. The automated editing decision making system maintains that particular choice as the output until such time as the position of the subject or object being filmed changes to another position. The automated editing decision making system then determines whether to maintain the same camera as the best, closest or most representative source, or to switch to an alternative source, based on the above process.
The system preferably also analyses the data set to provide time-based information for the relative position of something of interest, e.g. a ball in the field of play. The system then makes a decision as to which source or camera provides the closest and best representative view of the play, based on the location of the subject and the location of the source or camera relative to the field of play and the subject of interest (Figure 8). Thus, the information can be compared to the boundary 10 to provide zones in which each camera has the best image, and the automated editing device 250 can determine which of the cameras provides the best view and choose the data or picture of that camera.
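A minimal selection rule consistent with this description is nearest-camera-wins, with a small hysteresis margin so the output does not flicker between two almost equidistant sources. The margin value is an assumed tuning parameter, not something the patent specifies.

    import math

    def select_camera(subject_pos, camera_positions, current=None, margin=2.0):
        """Pick the camera judged to have the best view of the subject;
        here "best" is simply "nearest".

        subject_pos      -- (east, north) position of the subject
        camera_positions -- {camera_id: (east, north)}, e.g. from reverse
                            triangulation
        current          -- currently selected camera id, if any
        margin           -- metres by which a rival must win before we cut away
        """
        def dist(cam_id):
            cx, cy = camera_positions[cam_id]
            return math.hypot(subject_pos[0] - cx, subject_pos[1] - cy)

        best = min(camera_positions, key=dist)
        # Maintain the current choice until another camera is clearly better,
        # mirroring the hold-then-switch behaviour described above.
        if current in camera_positions and dist(current) <= dist(best) + margin:
            return current
        return best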
It will be understood that the invention may be employed as an apparatus and/or a system.
The automated editing device 250 is any data processing system suitably configured to enable implementation of the processes and apparatus of the embodiments.
A general description of suitable computer environments in which embodiments of the present invention may be implemented now follows. Although not required, the auto video editing device will be described in the general context of computer-executable instructions, such as program modules, being executed on a single computer. One skilled in the art will appreciate that the auto video editing devices and methods may be practiced with other computer systems including multi-processor systems, microprocessor-based systems, programmable consumer electronics, networked PCs, minicomputers, mainframe computers, handheld devices and the like.
By way of example, a computer based data processing system 1100, in which the automated editing device 250 is implemented according to one embodiment, is illustrated in FIG. 11. Data processing system 1100 has a processor (central processing unit (CPU)) 1101. Operably coupled to processor 1101, via one or more data buses, are a Random Access Memory (RAM) 1102 and a storage unit 1103. Input units 1104, 1105 are configured to input data into processor 1101 and output units 1106 are configured to output the processed data. Inputs can be entered from a keyboard, pointing device, USB stick, appropriate data connection or other suitable input. In the example of FIG. 11, the input units are a pointing device 1104, such as a mouse or touch screen pointer, and a text input device 1105, such as a keyboard or touch screen keys. Input can also be downloaded or fed from one or more networks via a network interface 1107. For example, inputs can be downloaded from the internet via a communication device.
Processor 1101 is configured to perform calculations, make decisions and control units of the data processing system. The input units and network interface accept data and instructions and input this information in a useable form to the data processing system for processing. RAM 1102 and storage unit 1103 store data and instructions input to the data processing system for and during processing by the processor, and for future use. RAM 1102 is utilized for, but not limited to, holding a program being executed by the processor, and related data. Storage unit 1103 is utilized for, but not limited to, archiving programs, documents, databases, data results etc. Non-limiting examples of storage devices are hard disks, USB sticks, DVDs, CDs etc.
The output units output the results of the data processed by the processing system. Typically, the output unit is a display or monitor 1106 for visually displaying the output data. Other possible types of output unit are, for example, a USB stick or output cable. Output data is also uploadable to a network via the network interface 1107 and a communication device.
It can be appreciated that the data-processing apparatus 1100 is not limited to the specific system of FIG. 11 and may in some embodiments be a mobile computing device such as a smartphone, laptop computer, iPhone or tablet device. In other embodiments, data-processing apparatus 1100 may function as a desktop computer, server, and the like, depending upon design considerations.
Referring now to FIG. 12, there is illustrated a computer software system 1200 for controlling data processing system 1100 of FIG. 11 to perform auto video editing operations. The software system is, for example, stored in RAM 1102 and storage unit 1103 of FIG. 11. An operating system 1201 is configured to control operation of components of the data processing system. One or more application software program modules are available for execution by the data processing system 1100. "Module" as defined herein refers to a simple application or to groupings of routines, programs, objects, components and/or data structures for performing one or more particular functions. Modules may be composed of an interface part and routines accessible by other modules.
In software system 1200 of FIG. 12, an auto video editing software application 1202 includes instructions for performing the operations described herein in relation to processes of the embodiments. Software 1202 may include one or more modules. In particular, software 1202 has a data collector module 1203 for collecting data from the mobile devices, a position determinator 1204 for determining the position of each mobile device with respect to the field of play from the collected data, a video editor 1205 for determining and selecting the mobile device giving the best or most desired view and automatically generating video from the best or most desired views, and an audio commentator 1206 for generating and adding commentary to the generated video.
Interface 1203 is for receiving and inputting user instructions and data into the data processing system. Interface 1203 may be a graphical user interface formed, for example, from the text input device, pointing input device and display of the system of FIG. 11. Alternatively or additionally, interface 1203 may be network interface 1107. The operating system module and/or automated video editing modules control the data processing system to act upon inputs from the interface(s). Operating system 1201 may in one embodiment be a Mac operating system. It can be understood that other types of operating system can be adopted, such as Microsoft Windows, Linux, Android, iOS or another operating system. It will be appreciated that once the data processing system has been pre-configured for auto video editing from particular mobile cameras, the auto video editing software application can run by itself, without further user interface inputs, to automatically edit video from the mobile cameras.
In one example, the present invention is embodied in a network of data processing systems. By way of example, FIG. 13 illustrates such a network. Network data processing system 1300 has a plurality of the mobile devices 20a, 20b for capturing video that are operably connectable via one or more networks 1301 to one or more servers 1302 and one or more clients 1303. Data processing system 1100 of FIG. 11 is implemented as client 1303 or as server 1302 depending on the particular application. Network(s) 1301 are in this example a telecommunication network and an internet network for connecting the mobile devices via the telecommunication network to one or more server(s) 1302 and for connecting the one or more servers 1302 to the client(s) 1303. In other examples, the network(s) can be intranet networks or a combination of both internet and intranet networks, with or without a telecommunication network. A number of different types of networks can be utilized, such as, for example, a local area network (LAN), a wide area network (WAN) or a virtual private network (VPN). In one example, the data processing system 1100 implemented in a client or server can receive data from the mobile devices over a Wi-Fi link, either directly or via a network, without reliance on a cellular telecommunication connection.
Network data processing system 1300 may include additional servers, clients and other systems and devices not shown in FIG. 13. The computation described herein may be executed on one or a plurality of servers and the information communicated over network(s) 1301 to client(s) 1303 or other devices. Network data processing system 1300 may also include storage or databases for storing data, such as the video images or mobile device data and/or audio commentary library data, for use by the auto video editing software running on the client or server.
Figure 9 is a flow diagram showing steps used in the invention. Firstly, data is collected from multiple independent sources 200. This data includes video footage, GPS data, compass bearing data, inclination data and zoom settings from each source. The data 200 from each source is analysed 210 to provide position information for each source, and this analysed data 210 is used to determine the extent of the field of play for each source 220. Data from a source is triangulated 230 to determine the position of the source with respect to the field of play of the event. Based on the location of a subject of the event 240, a decision is made by the automatic editing system 250 on which source provides the best view of play.
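Putting the pieces together, the Figure 9 flow can be sketched end to end with the triangulate() and select_camera() helpers above. The input format is an assumption, and a real system would combine all available sight lines rather than just the first two.

    def auto_edit(samples):
        """Sketch of the Figure 9 pipeline: at each instant, locate the
        subject from two sight lines, then decide which source to cut to.

        samples -- list of (timestamp_s, {camera_id: (cam_pos, bearing_deg)})
        Returns an edit decision list of (timestamp_s, camera_id) cut points.
        """
        edl, current = [], None
        for ts, views in samples:
            if len(views) < 2:
                continue  # need at least two sight lines to triangulate
            (_, (p1, b1)), (_, (p2, b2)) = list(views.items())[:2]
            subject = triangulate(p1, b1, p2, b2)
            if subject is None:
                continue  # parallel sight lines; keep the previous choice
            positions = {cid: pos for cid, (pos, _) in views.items()}
            chosen = select_camera(subject, positions, current)
            if chosen != current:
                edl.append((ts, chosen))  # record a cut to the new camera
                current = chosen
        return edl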
Therefore, one aspect of the invention resides in the ability to use mobile cameras at unknown positions to track the behaviour of the subject or scene and to use this information to put together the video without having to "edit" the video in the conventional sense. Auto commentary can also be added. This results in a complete, automatically edited audio-video recording of the subject or scene that does not require conventional post editing, so that local football matches etc. can still be covered despite the absence of conventional TV and video broadcasters. An additional aspect of the invention is to provide an audio commentary to the video footage whereby modulation of the emotive pitch and intensity of the commentary is preferably taken from a library based commentary derived from the known outcome of any particular action or play in an event, with the purpose of communicating the assumed tension that a commentator would naturally impart from being present at the event. Thus the commentary sounds human rather than an automated, robotic-sounding synthesised voice.
Alternatively, commentary can be made from what is recorded by each device and/or an independent commentator who gives an audio story whilst watching the video footage.
Figure 10 shows a flow chart according to the invention where the automated editing system 350 receives data 360, 362 from different sources and determines which data set gives the best view 370, i.e. which camera represents the best or closest or most representative view of the current play based on the location of the subject or object in the field of play. If boundaries 380 for each device have been stored, this information can be used to make this decision. The automated editing system 350 determines that the video output from this camera is selected 390 to be displayed or recorded as the output choice for a given moment.
The automated editing system 350 determines 400 each selected camera at each given instant over the complete range of footage for the entirety of the period of play by constantly checking the data received about where play is occurring.
In one embodiment, the automated editing system 350 is able to automatically edit footage for an entire period of play without operator control or intervention. The automated editing system 350 achieves this by monitoring the location of the subject or object 410 during play and determining whether to maintain or change the selected camera footage 370 as the best, closest or most representative view of the current subject or object being filmed.
The source can be any of a number of possible devices, including but not limited to a mobile phone, which can both take a video recording and substantially simultaneously transmit the data, along with positional information, to an analyser.
The automated editing system may be part of a video sharing web stream associated with a suitable communications system, such as but not limited to the internet or a wireless communications network.
It is to be appreciated that these Figures are for illustration purposes only and other configurations are possible.
The invention has been described by way of several embodiments, with modifications and alternatives, but having read and understood this description further embodiments and modifications will be apparent to those skilled in the art. All such embodiments and modifications are intended to fall within the scope of the present invention as defined in the accompanying claims.

Claims

Claims
1. A method for automatically producing a video of a subject or scene (30) based on images from at least two mobile video recording devices (20a, 20b) whereby triangulation is used to determine the relative positions of the plurality of mobile video recording devices with respect to the subject or scene (30).
2. A method according to claim 1 wherein, an automatic editing system is provided and the automatic editing system (250) carries out the triangulation process.
3. A method according to claim 1 or claim 2 wherein, data (200) from the at least two mobile video recording devices (20a, 20b) is used in the triangulation process.
4. A method according to claim 3 wherein, the data (200) includes one or more of video footage; GPS; location; compass bearing; inclination; and zoom settings.
5. A method according to any of claims 2 to 4 wherein, the automatic editing system (250) stores boundaries for each of the at least two mobile video recording devices (20a,20b).
6. A method according to any of claims 2 to 5 wherein, over a period of time the automated editing system (250) determines the maximum extent of the field of movement (10) of the subjects or objects being filmed.
7. A method according to any preceding claim wherein, reverse triangulation is used to determine the position of each mobile video recording device (20a,20b) relative to the field of movement (10) of the subjects or objects being filmed.
8. A method according to any of claims 2 to 7 wherein, the automated editing system (250) determines which camera (20a,20b) represents the best or closest or most representative view of the current play based on the location of the subject or object (30) in the field of play (10).
9. A method according to claim 8 wherein, the automated editing system (250) determines that the video output from this camera (20a,20b) is selected to be displayed or recorded as the output choice for a given moment.
10. A method according to claim 8 or claim 9 wherein, the automated editing system (250) determines each selected camera (20a, 20b) at each given instant over the complete range of footage for the entirety of the period of play.
11. A method according to any of claims 2 to 10 wherein, the automated editing system (250) is able to automatically edit footage for an entire period of play without operator control or intervention.
12. A method according to claim 8 or claim 9 wherein, the automated editing system (250) monitors the location of the subject or object (30) during play and determines whether to maintain or change the selected camera (20a,20b) footage as the best, closest or most representative view of the current subject or object being filmed.
13. A method according to any preceding claim wherein, the automated editing system additionally compiles an audio commentary taken from a library of commentaries and using the known outcome of an event during a period of play.
14. A method of compiling an audio commentary for video footage comprising the steps of:
accessing a library of commentaries; and
using the known outcome of an event during a period of play, compiling the audio commentary.
15. A system for automatically producing a video of a subject or scene based on images from at least two mobile video recording devices whereby a means is provided to perform triangulation to determine the relative positions of the plurality of mobile video recording devices with respect to the subject or scene.
16. The system according to claim 15, wherein said means comprises an automatic editing device (250).
17. The system according to claim 16, further comprising data (200) from the at least two mobile video recording devices (20a, 20b); and wherein said automated editing device is configured to use said data in the triangulation process.
18. The system according to claim 17 wherein said automated editing device (250) is configured to perform reverse triangulation to determine the position of each mobile video recording device (20a, 20b) relative to the field of movement (10) of the subjects or objects being filmed.
19. The system according to claim 17 wherein said automated editing system (250) is configured to determine which camera (20a, 20b) represents the best or closest or most representative view of the current play based on the location of the subject or object (30) in the field of play (10).
20. A computer-usable medium for automatically producing a video of a subject or scene based on images from at least two mobile video recording devices, said computer-usable medium embodying computer program code, said computer program code comprising computer executable instructions configured to perform triangulation to determine the relative positions of the plurality of mobile video recording devices with respect to the subject/scene.
PCT/GB2011/001264 2011-08-23 2011-08-23 Improvements in automatic video production WO2013026991A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/GB2011/001264 WO2013026991A1 (en) 2011-08-23 2011-08-23 Improvements in automatic video production

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/GB2011/001264 WO2013026991A1 (en) 2011-08-23 2011-08-23 Improvements in automatic video production

Publications (1)

Publication Number Publication Date
WO2013026991A1 (en)

Family

ID=44583182

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2011/001264 WO2013026991A1 (en) 2011-08-23 2011-08-23 Improvements in automatic video production

Country Status (1)

Country Link
WO (1) WO2013026991A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL2012399A (en) * 2014-03-11 2015-11-19 De Vroome Poort B V Autonomous camera system for capturing sporting events.
WO2018004354A1 (en) * 2016-07-01 2018-01-04 Teameye As Camera system for filming sports venues

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1480450A2 (en) * 2003-05-20 2004-11-24 British Broadcasting Corporation Automated video production
US20090063419A1 (en) * 2007-08-31 2009-03-05 Jukka Kalevi Nurminen Discovering peer-to-peer content using metadata streams
US20090148124A1 (en) * 2007-09-28 2009-06-11 Yahoo!, Inc. Distributed Automatic Recording of Live Event
US20100014750A1 (en) * 2008-07-18 2010-01-21 Fuji Xerox Co., Ltd. Position measuring system, position measuring method and computer readable medium
US20100026809A1 (en) * 2008-07-29 2010-02-04 Gerald Curry Camera-based tracking and position determination for sporting events

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1480450A2 (en) * 2003-05-20 2004-11-24 British Broadcasting Corporation Automated video production
US20090063419A1 (en) * 2007-08-31 2009-03-05 Jukka Kalevi Nurminen Discovering peer-to-peer content using metadata streams
US20090148124A1 (en) * 2007-09-28 2009-06-11 Yahoo!, Inc. Distributed Automatic Recording of Live Event
US20100014750A1 (en) * 2008-07-18 2010-01-21 Fuji Xerox Co., Ltd. Position measuring system, position measuring method and computer readable medium
US20100026809A1 (en) * 2008-07-29 2010-02-04 Gerald Curry Camera-based tracking and position determination for sporting events

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL2012399A (en) * 2014-03-11 2015-11-19 De Vroome Poort B V Autonomous camera system for capturing sporting events.
EP2966851A1 (en) * 2014-03-11 2016-01-13 De Vroome Poort B.V. Autonomous camera system for capturing sporting events
WO2018004354A1 (en) * 2016-07-01 2018-01-04 Teameye As Camera system for filming sports venues

Similar Documents

Publication Publication Date Title
US12135867B2 (en) Methods and systems for presenting direction-specific media assets
US8588824B2 (en) Transferring media context information based on proximity to a mobile device
CN106257930B (en) Generate the dynamic time version of content
CN102547479B (en) The generation of media metadata and supply
US9578365B2 (en) High quality video sharing systems
AU2019216671A1 (en) Method and apparatus for playing video content from any location and any time
US10212325B2 (en) Systems and methods to control camera operations
WO2018102243A1 (en) Live video recording, streaming, viewing, and storing mobile application, and systems and methods of use thereof
CN112822563A (en) Method, device, electronic equipment and computer readable medium for generating video
US11048748B2 (en) Search media content based upon tempo
CN104335594A (en) Automatic digital curation and tagging of action videos
WO2014001607A1 (en) Video remixing system
US11277668B2 (en) Methods, systems, and media for providing media guidance
CN107172502B (en) Virtual reality video playing control method and device
US10924803B2 (en) Identifying viewing characteristics of an audience of a content channel
JP2019033430A (en) Movie reproduction apparatus, control method thereof, and program
CN111800668A (en) Bullet screen processing method, device, equipment and storage medium
CN113315980A (en) Intelligent live broadcast method and live broadcast Internet of things system
US20120099842A1 (en) Editing apparatus, editing method, program, and recording media
WO2013026991A1 (en) Improvements in automatic video production
KR101958936B1 (en) Method and system for constructing content of interest in a tv channel
US10137371B2 (en) Method of recording and replaying game video by using object state recording method
Fujisawa et al. Automatic content curation system for multiple live sport video streams
CN112969028A (en) Intelligent live broadcast method and live broadcast Internet of things system
KR102372181B1 (en) Display device and method for control thereof

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11752321

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 11752321

Country of ref document: EP

Kind code of ref document: A1