WO2015195390A1 - Multiple viewpoints of an event generated from mobile devices - Google Patents

Multiple viewpoints of an event generated from mobile devices

Info

Publication number
WO2015195390A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
event
location
contents
video
Application number
PCT/US2015/034661
Other languages
French (fr)
Inventor
Guillaume Andre Roger GOUSSARD
Arden A. Ash
Ray Edward STARCK
Joel M. Fogelson
Original Assignee
Thomson Licensing
Application filed by Thomson Licensing
Publication of WO2015195390A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/414 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41407 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508 Management of client data or end-user data
    • H04N21/4524 Management of client data or end-user data involving the geographical location of the client
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8583 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by creating hot-spots

Definitions

  • All assets for a single event may then be synchronized and referenced against a single timeline using the time metadata that is associated with those assets.
  • Ei (Event 1 ) is associated with a GPS Range of Values ⁇ R1 ... ⁇ for a Time Duration 1 .
  • E 2 (Event 2) is associated with a GPS Range of Values ⁇ R2... ⁇ at a Time Duration 2.
  • E 3 (Event 3) is associated with a GPS Range of Values ⁇ R3... ⁇ at a Time Duration 3.
  • A-i, A 4 , and A 9 represent baseline audio
  • FIG. 2 represents an exemplary embodiment of the present invention where multiple video, audio, and annotation contents may be presented and correlated, all referenced against a common timeline.
  • a first event Event1 may comprise a first video V1 200 and a first audio track A1 215 from a mobile phone capturing the entire event on stage, with no textual annotation at the start of the event.
  • a second video V4 205 from another device and a third video V5 210 from yet another device are then able to be placed appropriately on the timeline as well.
  • the first video V1 200 may be displayed on, e.g., sub screen 105 of FIG. 1,
  • the second video V4 205 may be displayed on, e.g., sub screen 110 of FIG. 1, and
  • the third video V5 210 could be displayed on sub screen 115, when the appropriate time had been reached while Event1 was being viewed.
  • a first, main audio A1 215 could also be uploaded and positioned appropriately on the timeline, and a second audio A3 220 and third audio A5 225 could be placed on the timeline in the same fashion as the first, second, and third videos.
  • the various audio tracks could, in one embodiment, then be available for selection on the user interface shown in FIG. 1 and explained before, when the time associated with those audio clips was reached.
  • any annotations could be placed on the timeline at appropriate times, such as the time at which the annotations were made.
  • a first annotation AN1 230 could be displayed in the annotation window 140 of the user interface of FIG. 1 before the second and third videos are available for display.
  • a second annotation AN2 235 could be displayed during a time when all three video assets are available for viewing, and a third annotation AN3 240 could be displayed while the first and third videos were available for viewing, etc.
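  • As a minimal illustration (not taken from the patent) of the timeline arrangement just described, media assets can be stored with start and end offsets relative to a common event timeline and queried for what is available at a given playback time; the asset names and offsets below are invented placeholders loosely following FIG. 2:

      # Minimal sketch: place media assets on a common event timeline and list
      # which assets are available at a given playback time. Offsets are
      # invented for illustration only.
      from dataclasses import dataclass

      @dataclass
      class Asset:
          asset_id: str   # e.g., "V1", "A3", "AN1"
          kind: str       # "video", "audio", or "annotation"
          start: float    # seconds from the start of the event timeline
          end: float

      timeline = [
          Asset("V1", "video", 0, 5400),     # phone capturing the whole event
          Asset("V4", "video", 600, 3000),   # second viewpoint joins later
          Asset("V5", "video", 1200, 5400),  # third viewpoint
          Asset("A1", "audio", 0, 5400),     # main audio track
          Asset("A3", "audio", 600, 3000),   # commentary track
          Asset("AN1", "annotation", 300, 330),
      ]

      def assets_at(timeline, t):
          """Return the assets whose time span covers playback time t."""
          return [a for a in timeline if a.start <= t <= a.end]

      print([a.asset_id for a in assets_at(timeline, 900.0)])
      # ['V1', 'V4', 'A1', 'A3'] are available at t = 900 s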
  • Media assets, after being grouped together, can be listed as different events, where such events can be accessed from a web site or a media asset service such as M-Go.
  • events could be listed in the following manner:
  • a representation of the venue of the event is shown where the various viewpoints of the venue (V1 - Garden Boxes, V4 - Promenade, V5 - Ramp Seats) may be selected if such video assets are available (refer to FIG. 2 for the timing of video events).
  • the availability of different viewpoints is represented by, e.g., person symbols 151-153, representing the presence and the specific locations of different users within the event venue. Note that if a video is only available for part of an event presentation, the person symbol will only appear coincident in time with the availability of the video representing that viewpoint.
  • annotation window 140 displays various textual and/or other annotations that different users made during the event.
  • a content navigation section comprising a timeline 141, various user controls 142 (e.g., play, stop, fast forward, and reverse), and a slider 143 for navigating the contents, for example, to play, stop, fast forward, or reverse the contents.
  • audio track A1 131 may be used as the primary audio track while secondary audio tracks A3 132 and A5 133 (e.g., additional audio commentaries) may be mixed in together with the primary audio as described previously.
  • playback windows other than the exemplary ones shown in FIG. 1 may be constructed and rendered in accordance with the principles of the present invention.
  • One optional embodiment of the disclosed principles provides an audio sync watermark that is placed within the audio played at a venue (e.g., the music of a concert or fashion show) and the like.
  • This sync information can be used to synchronize the videos of people located at different positions within a venue so that the "background" audio of each recording can match up.
  • a person 153 recording a video at the front row of seats at a venue or event location 150 and a person 151 recording video at the back of a venue or event location 150 will have disparities in the background audio if a common timestamp format is used.
  • the audio syncs can be used to adjust the timing between both people so that if the video at a website is switched between the front row and the back row, potential problems with audio sync are minimized. In streaming audio, this kind of problem is known as audio drift.
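  • The watermark-based synchronization itself is not reproduced here, but as a rough, generic stand-in, the relative offset between two recordings of the same background audio can be estimated by cross-correlation and then used to line the recordings up on the shared timeline; the signals and sample rate below are synthetic placeholders:

      # Simplified stand-in for the audio-sync idea: estimate how much later one
      # recording of the same background audio starts relative to another by
      # cross-correlating the two signals. (The embodiment itself describes an
      # embedded audio watermark; this generic sketch only illustrates using an
      # estimated offset to align recordings and limit audio drift.)
      import numpy as np

      def start_offset_seconds(ref, other, sample_rate):
          """Seconds by which `other`'s recording starts later than `ref`'s."""
          corr = np.correlate(ref, other, mode="full")
          lag = int(np.argmax(corr)) - (len(other) - 1)
          return lag / sample_rate

      rng = np.random.default_rng(0)
      sr = 8000
      venue_audio = rng.standard_normal(sr)      # 1 s of stand-in venue audio
      front_row = venue_audio[:4000]             # recording started at t = 0 s
      back_row = venue_audio[2000:6000]          # recording started 0.25 s later
      print(round(start_offset_seconds(front_row, back_row, sr), 3))   # 0.25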
  • a user can be presented with viewpoint options of their friends to whom they are linked on a social network such as Facebook or M-Go Social.
  • a user who generated V1 at location 153 is linked to the user who generated V4 at location 152 via a social networking site, but not to the user who generated V5 at location 151.
  • the user associated with V1 could then only see content generated from a friend (V4) but not from a stranger (V5).
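  • A minimal sketch (with invented user names and links, not the patent's data model) of restricting the selectable viewpoints to those produced by the viewer's friends on a linked social network:

      # Hedged sketch: only expose viewpoints owned by the viewer or by users the
      # viewer is linked to on a social network. All names are placeholders.
      friends_of = {
          "user_v1": {"user_v4"},   # V1's user is linked to V4's user only
      }
      viewpoint_owner = {"V1": "user_v1", "V4": "user_v4", "V5": "user_v5"}

      def visible_viewpoints(viewer, viewpoints):
          """Return the viewpoints owned by the viewer or by the viewer's friends."""
          linked = friends_of.get(viewer, set()) | {viewer}
          return [v for v in viewpoints if viewpoint_owner[v] in linked]

      print(visible_viewpoints("user_v1", ["V1", "V4", "V5"]))
      # ['V1', 'V4'] -- the stranger's viewpoint V5 is filtered out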
  • the device information referenced above is used to determine whether or not the content from a certain device gets preference over other content. This can occur if, e.g., a device manufacturer sponsors an event at a particular venue.
  • Samsung may want to promote content generated from their devices if Samsung sponsored an event. This content may be displayed more prominently as shown in the larger sub window 105 of FIG. 1. Content generated from other devices made by other manufacturers may not be shown, or may be shown in smaller, less prominent windows 110 and 115 as illustrated in FIG. 1.
  • a filtering mode can be offered where video effects affect the playback or recording of videos based on the direction or plane in which a user moves their mobile/recording device.
  • a different filter can be applied to the color of a video recording if the mobile device is moved along, e.g., the X-Y axis (e.g., lighter to darker; more to less saturation, etc.), while a fuzzy/sharpen filter can be applied in a second direction, e.g., along the Y-Z axis.
  • This lets a person adjust the video filters on the fly. This is something that neither Vine nor Instagram currently does.
  • a user could specify different filters in accordance with different directions (black and white/color transition, noise, film grain, etc.)
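  • A minimal sketch of such a filtering mode, under the assumption that motion deltas are read from the device's motion sensors; the axis-to-filter mapping and scale factors below are invented for illustration:

      # Hedged sketch: translate device motion along different axes into video
      # filter parameters (X -> brightness, Y -> saturation, Z -> blur), clamped
      # to sensible ranges. A real app would read dx/dy/dz from the gyroscope or
      # accelerometer and apply the resulting filters per frame.
      def filter_params(dx, dy, dz):
          """Map a motion delta (arbitrary units) to filter settings."""
          clamp = lambda v, lo, hi: max(lo, min(hi, v))
          return {
              # movement in the X-Y plane: lighter/darker and more/less saturation
              "brightness": clamp(1.0 + 0.25 * dx, 0.2, 2.0),
              "saturation": clamp(1.0 + 0.25 * dy, 0.0, 2.0),
              # movement toward the Y-Z plane: blur (positive) vs. no blur (zero)
              "blur_radius": clamp(2.0 * dz, 0.0, 10.0),
          }

      print(filter_params(dx=3.0, dy=-2.0, dz=0.5))
      # e.g. {'brightness': 1.75, 'saturation': 0.5, 'blur_radius': 1.0}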
  • FIG. 4 shows an exemplary embodiment of a process according to the principles of the present invention.
  • a first plurality of data comprising, e.g., the location, the time, video content, audio content, and/or various metadata is sent from a first device to a server 300 and received by the server 300.
  • the video and audio contents may be pre-recorded prior to submission.
  • Another embodiment allows the contents to be streamed live.
  • a second plurality of data is provided.
  • the second plurality of data may include a graphical representation of the area of the event location, time duration of the program event, and/or a description of the program event.
  • the graphical representation of the event may be processed and shown as illustrated in the event location window 150 of FIG. 1, in which a representative image of the location is shown. Further information may be provided including, e.g., GPS coordinates and the number of users and content sources 151-153 at the event location. For example, a representation of a football stadium may simply be an overhead image of the stadium, or it may be a graphical representation of the seating chart along with GPS coordinates circumscribing the stadium.
  • a third plurality of data comprising, e.g., the location, the time, video content, audio content, and/or various metadata is sent from a second device, located at some viewing point within the area of the event location, to a server 300 and received by the server 300.
  • a program event being common between the first and second video contents is identified based upon, e.g., the time and location information included in the first, second, and third pluralities of data.
  • in step 425, these common media assets are examined and arranged in such a way that the video and audio from each of the sources is synchronized.
  • in step 430, a graphical representation of the event location is provided to a user, along with identified viewing points on or near that representation, allowing the user to select a viewing point on the user interface to select a video asset to view.
  • a sponsor of an event could highlight or emphasize viewing points that match a particular device manufacturer. Alternatively, this could be from individuals linked to certain social media groups, or individuals who use a particular hashtag to annotate the video.
  • audio content with higher quality is utilized regardless of video content selection.
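  • As a minimal sketch (the data model, quality scores, and URLs below are invented placeholders, not the patent's actual format) of how the grouped contents could then be served: the user's selected viewing point is paired with a selected or highest-quality audio track so the client can play the combination against the common timeline:

      # Hedged sketch: build a small playback "manifest" pairing the chosen video
      # viewpoint with a chosen (or best available) audio track.
      def build_manifest(group, selected_viewpoint, preferred_audio=None):
          videos = {v["viewpoint"]: v for v in group["videos"]}
          if selected_viewpoint not in videos:
              raise ValueError("no video available for that viewing point")
          if preferred_audio is not None:
              audio = next(a for a in group["audios"] if a["id"] == preferred_audio)
          else:
              audio = max(group["audios"], key=lambda a: a["quality"])  # best audio
          return {
              "video_url": videos[selected_viewpoint]["url"],
              "audio_url": audio["url"],
              "timeline_start": videos[selected_viewpoint]["timeline_start"],
          }

      group = {
          "videos": [
              {"viewpoint": "V1", "url": "https://example.invalid/v1.mp4", "timeline_start": 0},
              {"viewpoint": "V4", "url": "https://example.invalid/v4.mp4", "timeline_start": 600},
          ],
          "audios": [
              {"id": "A1", "url": "https://example.invalid/a1.aac", "quality": 0.9},
              {"id": "A3", "url": "https://example.invalid/a3.aac", "quality": 0.6},
          ],
      }
      print(build_manifest(group, "V4"))   # pairs V4's video with the best audio, A1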
  • FIG. 5 shows another exemplary process according to the principles of the present invention.
  • a plurality of contents is received from a plurality of devices at a location having an event.
  • a program event is automatically determined based on e.g., location information, timing information, video, audio, and/or annotation information from a plurality of devices.
  • a graphical representation of the event location is provided on a user interface.
  • Superimposed on this graphical representation are the locations of each available viewpoint or users providing their respective content at any point in time of an event.
  • Shown in FIG. 1, in the upper left box of the user interface, is the superimposition of graphical representations representing different viewpoints V1 151, V4 152, and V5 153 at an event location 150.
  • This graphical representation could be displayed in a variety of ways, including on a user interface of a device at the event, as noted in step 515, or accessed on a device after the event.
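  • A minimal sketch (with placeholder venue corner coordinates and image size) of superimposing each contributor's position onto the venue graphic: the reported GPS position is projected linearly over the venue's bounding box to obtain pixel coordinates at which the person symbol, or a user-chosen image, can be drawn:

      # Hedged sketch: map a (lat, lon) position inside a venue's bounding box to
      # (x, y) pixel coordinates on a map image, so a viewpoint marker can be
      # drawn at the right spot. A simple linear projection is assumed, which is
      # adequate for a venue-sized area; the numbers are placeholders.
      def to_map_pixels(lat, lon, bbox, image_size):
          lat_min, lat_max, lon_min, lon_max = bbox
          width, height = image_size
          x = (lon - lon_min) / (lon_max - lon_min) * width
          y = (lat_max - lat) / (lat_max - lat_min) * height   # y grows downward
          return round(x), round(y)

      venue_bbox = (48.8410, 48.8420, 2.2520, 2.2540)   # placeholder venue corners
      print(to_map_pixels(48.8415, 2.2530, venue_bbox, (800, 600)))   # (400, 300)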
  • a user may decide to select an image to represent their viewpoint at any time.
  • instead of simply displaying a generic person symbol V1, the system might display a close-up image of the actual user's face from his or her selfie picture.
  • step 525 allows a user to modify their graphical representation at any time. For example, a user could identify their mood via a representative emoticon or image as it changes during a football game: excited in the beginning, happy when their team scores, and upset when they eventually lose the game.
  • the icon may represent the logo of one of the opposing teams to identify the allegiance of the user.
  • These graphical representations could be displayed on a device during the event, as seen in step 530, or at any other time that is desirable or useful. While several embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The presented embodiments generally relate to a method and apparatus for processing, collating, and/or presenting media content for events captured at a plurality of viewpoint positions at an event. Accordingly, a system and a method are described for acquiring contents of an event using time and geographic information, processing such contents into a cohesive grouping, and then presenting such contents in a format that allows a user to access different contents of the same event, where each content represents a different viewpoint of such an event. Additional features and processing of these contents are also provided.

Description

MULTIPLE VIEWPOINTS OF AN EVENT GENERATED FROM MOBILE
DEVICES
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority to and all benefits accruing from the provisional application filed in the United States Patent and Trademark Office on June 18, 2014 and assigned serial number 62/013,622.
BACKGROUND OF THE INVENTION
Field of the Invention
The present principles of the embodiments generally relate to a method and apparatus for processing, collating, and/or presenting media content for events captured at a plurality of viewpoint positions at an event.
Background Information
High-quality video and audio capture capability is now standard in mobile devices. With mobile devices nearly always present, more and more users are able to capture events on camera and share instantly. In fact, multiple individuals will often share the same event; when different people attend an event such as a concert or a fashion show, many will use their mobile phone, camera, tablet, computer, and the like as a capture device to record the audio and/or the video of some, or all, of the event.
After capturing video and audio, these individuals upload the video to video sharing sites such as YouTube® or Vine®, where different videos of varying quality are used to showcase the same event. A problem is that no service exists that groups such videos together to form representations of such events and provides additional processing and features for these contents.
SUMMARY OF THE INVENTION
An object of the present invention is to provide solutions and improvements to event media asset management in accordance with the principles of the present invention and as recognized by the present inventors. Accordingly, an exemplary embodiment provides a framework for acquiring media, e.g., videos of an event, using time and geographic information, processing such videos into a cohesive grouping, and then presenting such videos in a format that allows a user to access different videos of the same event, where each video represents a different viewpoint of such an event. Such videos can be presented in the form of micro-channels (e.g., one channel per event), lists, and the like. Additional features and processing of these contents are also provided.
In accordance with an exemplary embodiment, a method is presented, comprising:
receiving a first plurality of data from a first device, the first device located at a first viewing point within an area of location, the first plurality of data including first data, second data, third data, and fourth data,
the first data indicating a location of the first viewing point, the second data indicating a first time, the third data including a first video content exhibiting a program event captured from the first viewing point at the first time, the fourth data including a first audio content captured at the first viewing point at the first time, the first audio content associated with the first video content;
providing a second plurality of data, the second plurality of data including fifth data, sixth data, and seventh data, the fifth data indicating a graphical representation of the area of location, the sixth data indicating a time duration of the program event, the seventh data including a description of the program event;
receiving a third plurality of data from a second device, the second device located at a second viewing point within the area of location, the third plurality of data including eighth data, ninth data, tenth data, and eleventh data, the eighth data indicating a location of the second viewing point, the ninth data indicating a second time, the tenth data including a second video content exhibiting the program event captured from the second viewing point at the second time, the eleventh data including a second audio content captured at the second viewing point at the second time, the second audio content associated with the second video content;
identifying the program event being common between the first and second video contents based upon the time and location information included in the first, second, and third pluralities of data;
forming a group of contents for the program event, the group of contents comprising the first video and first audio contents and the second video and second audio contents;
processing the group of contents in such a way that a combination of the first video and the second audio contents is played in a synchronized manner; and
providing a graphical representation of the area of location, the graphical representation including user-selectable points representing respective ones of the first and second viewing points simultaneously on a user interface screen.
In another exemplary embodiment, an apparatus is presented, comprising:
means, including a communication interface, for receiving a first plurality of data from a first device and a third plurality of data from a second device, the first device located at a first viewing point within an area of location, the first plurality of data including first data, second data, third data, and fourth data, the first data indicating a location of the first viewing point, the second data indicating a first time, the third data including a first video content exhibiting a program event captured from the first viewing point at the first time, the fourth data including a first audio content captured at the first viewing point at the first time, the first audio content associated with the first video content; the second device located at a second viewing point within the area of location, the third plurality of data including eighth data, ninth data, tenth data, and eleventh data, the eighth data indicating a location of the second viewing point, the ninth data indicating a second time, the tenth data including a second video content exhibiting the program event captured from the second viewing point at the second time, the eleventh data including a second audio content captured at the second viewing point at the second time, the second audio content associated with the second video content;
means, including a processor, for providing a second plurality of data, the second plurality of data including fifth data, sixth data, and seventh data, the fifth data indicating a graphical representation of the area of location, the sixth data indicating a time duration of the program event, the seventh data including a description of the program event;
means, including the processor and an event identification database, for identifying the program event being common between the first and second video contents based upon the time and location information included in the first, second, and third pluralities of data;
means, including the processor and a media asset storage, for forming a group of contents for the program event, the program event being common between the first and second video contents based upon the time and location information included in the first, second, and third pluralities of data, the group of contents comprising the first video and first audio contents and the second video and second audio contents; and
means, including the processor, the event identification database, and the media asset storage, for providing the group of contents in such a way that a combination of the first video and the second audio contents is played in a synchronized manner and a graphical representation of the area of location, the graphical representation including user-selectable points representing respective ones of the first and second viewing points simultaneously on a user interface screen.
In accordance with another exemplary embodiment, an apparatus is presented, comprising:
a communication interface for receiving a first plurality of data from a first device and a third plurality of data from a second device, the first device located at a first viewing point within an area of location, the first plurality of data including first data, second data, third data, and fourth data, the first data indicating a location of the first viewing point, the second data indicating a first time, the third data including a first video content exhibiting a program event captured from the first viewing point at the first time, the fourth data including a first audio content captured at the first viewing point at the first time, the first audio content associated with the first video content; the second device located at a second viewing point within the area of location, the third plurality of data including eighth data, ninth data, tenth data, and eleventh data, the eighth data indicating a location of the second viewing point, the ninth data indicating a second time, the tenth data including a second video content exhibiting the program event captured from the second viewing point at the second time, the eleventh data including a second audio content captured at the second viewing point at the second time, the second audio content associated with the second video content;
a processor for providing a second plurality of data, the second plurality of data including fifth data, sixth data, and seventh data, the fifth data indicating a graphical representation of the area of location, the sixth data indicating a time duration of the program event, the seventh data including a description of the program event;
the processor identifies the program event being common between the first and second video contents based upon the time and location information included in the first, second, and third pluralities of data, and forms a group of contents for the program event, the program event being common between the first and second video contents based upon the time and location information included in the first, second, and third pluralities of data, the group of contents comprising the first video and first audio contents and the second video and second audio contents; and
the processor provides the group of contents in such a way that a combination of the first video and the second audio contents is played in a synchronized manner and a graphical representation of the area of location, the graphical representation including user-selectable points representing respective ones of the first and second viewing points simultaneously on a user interface screen.
In accordance with another exemplary embodiment, a method is presented, comprising: receiving a plurality of contents from a plurality of devices at a location having an event,
determining automatically the event based on location information and timing information from the plurality of devices, and
providing a graphical representation of the location having the event, the graphical representation including superimposed the location information from the plurality of devices.
DETAILED DESCRIPTION OF THE DRAWINGS
The above-mentioned and other features and advantages of this invention, and the manner of attaining them, will become more apparent and the invention will be better understood by reference to the following description of embodiments of the invention taken in conjunction with the accompanying drawings, wherein:
FIG. 1 shows an exemplary user interface of an apparatus according to the principles of the present invention;
FIG. 2 shows a timeline representation of various media assets recording a particular event according to the principles of the present invention;
FIG. 3 shows an exemplary system according to the principles of the present invention;
FIG. 4 shows an exemplary process according to the principles of the present invention.
FIG. 5 shows another exemplary process according to the principles of the present invention.
The examples set out herein illustrate exemplary embodiments of the invention. Such examples are not to be construed as limiting the scope of the invention in any manner.
DETAILED DESCRIPTION
FIG. 3 represents one exemplary embodiment of a system according to the principles of the present invention. As shown in FIG. 3, various mobile devices, such as, e.g., mobile phones 350-1 to 350-n, camera 351, tablet 352, and laptop computer 353, may be used by various users at an event and at different viewpoints of the event location to record and transmit media asset information, such as, e.g., video, audio, location information (e.g., GPS), orientation, device brand, annotations, and additional miscellaneous metadata, etc., to a website or server 300, via a network such as the internet 398.
Website or server 300 receives the data from the mobile devices through network 398 via a communication interface unit 310. The data is accessed by a processor 320, which processes and correlates the data and stores the relevant information in an event identification database 330 and media asset storage appropriately. Event identification database 330 stores information about each event, including information such as, e.g., a graphical representation of the location, time duration, and description information for the program event. Processor 320 compares the geographic location and timing information of each of the various received media assets with the information in the event identification database 330, to determine if assets are from the same event.
Assets belonging to the same event are grouped together and stored along with information identifying the event in media asset storage 340 in such a way to allow users to view all of that particular event's assets in a synchronized fashion. Alternatively, processor 320 groups the assets and provides the group information to communication interface unit 310, which allows access to the data by the various mobile devices, allowing users at an event to select alternate viewpoints from other users at the same event in real time, or move to another location with a different viewpoint as appropriate or desired.
Also shown in FIG. 3 is a detailed block diagram of an exemplary mobile device 350-1 for use according to the principles of the present invention. Device 350-1 is an example of a typical cellphone such as, e.g., an Android phone (e.g., Samsung S3, S4, or S5) or an Apple iOS phone (e.g., iPhone 5S or 5C). Such phones would typically include, e.g., a processor 395 for processing various data and controlling the various components of the phone, a microphone 385 and a camera 390 for recording audio and video contents, user I/O 370 including a virtual touch keyboard and a display for inputting and outputting user data, GPS circuitries 365 for processing GPS positioning information, gyroscope circuitries 375 for processing phone orientation information, memory 380 for storing various information as necessary, and a communication interface 360 for connecting and
communicating to, e.g., the internet via a Wi-Fi and/or a cellphone network such as 3G, 4G, LTE, etc.
According to the principles of the present invention, an exemplary embodiment defines an event, including the location, time, and description of the event. First, a definition of a geographic location in which an event takes place is provided. For example, a location such as an arena is defined, e.g., by positional (GPS) coordinates and an orientation. The positional coordinates may be defined in a variety of manners, including a single point, a single point and a radius, or multiple points circumscribing the area of the venue (a locus of positional coordinates). Such areas can be defined by having a map or graphical depiction of different locations and having GPS, or positional, information located in a database. Such locations can represent concert venues, sporting arenas, museums, and the like.
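As a minimal sketch of such area definitions (the coordinates, radius, and vertices below are invented placeholders, and the distance math is a flat-earth approximation that is adequate at venue scale), a venue can be modeled either as a point with a radius or as a polygon of positional coordinates, and a device position can be tested for membership:

    # Hedged sketch: two ways of describing a venue area (point + radius, or a
    # polygon of coordinates) and testing whether a device position falls inside.
    import math

    def within_radius(point, center, radius_m):
        """True if `point` lies within `radius_m` meters of `center` (lat, lon)."""
        dy = (point[0] - center[0]) * 111_320.0                     # m per degree latitude
        dx = (point[1] - center[1]) * 111_320.0 * math.cos(math.radians(center[0]))
        return math.hypot(dx, dy) <= radius_m

    def within_polygon(point, polygon):
        """Ray-casting point-in-polygon test for a venue circumscribed by vertices."""
        lat, lon = point
        inside = False
        for (lat1, lon1), (lat2, lon2) in zip(polygon, polygon[1:] + polygon[:1]):
            if (lon1 > lon) != (lon2 > lon):                        # edge crosses the ray
                if lat < (lat2 - lat1) * (lon - lon1) / (lon2 - lon1) + lat1:
                    inside = not inside
        return inside

    arena_center, arena_radius = (40.7505, -73.9934), 150.0         # placeholders
    print(within_radius((40.7509, -73.9930), arena_center, arena_radius))       # True
    square = [(40.750, -73.995), (40.752, -73.995), (40.752, -73.991), (40.750, -73.991)]
    print(within_polygon((40.751, -73.993), square))                             # True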
Defining the locations in finer granularity may be important when there are multiple events taking place in a geographic location. For example, during a music festival, there may be multiple venues that are used for different concerts. The use of location information for such venues would indicate whether a user was in attendance at a first event at a first location, versus a second event at a second location. This could be done automatically as video and audio content arrive, based on information provided along with the videos. Hence, a plurality of contents from multiple users may be grouped together automatically into the same event without a user needing to, e.g., sign up or sign on to that event, or to first explicitly search for or identify that event some other way. As shown in FIG. 3 and as described previously, an event identification database 330 for identifying events may be used that contains the GPS, or positional, information, orientations, and time durations that uniquely identify each event. Time durations define the time during which the event occurred. For example, time durations could be described by start and stop dates and times, a start date and time and the length of time the event lasts, or any other configuration that provides sufficient boundaries to accurately describe when an event occurred. Exemplary records in the database 330 may be represented as, for example:
GPS Coordinates    Orientation    Time Duration    Event
X, Y, Z            0°             12:00-14:00      Concert 1
X, Y, Z            3°             14:00-15:30      Concert 2
P, Q, R            30°            12:00-14:00      Concert 3
P, Q, R            33°            14:00-16:00      Concert 4
Location and time information from a mobile device can then be referenced against this database, where the location and time information indicate whether a user was at a first or a second event. One optional embodiment would allow the database information to be sent to the mobile devices prior to uploading or streaming content from the mobile device, such that the event(s) would be identified automatically as the video was being recorded and/or prior to being submitted.
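As a minimal, hedged sketch of this lookup (the records, coordinates, radius, and dates are placeholders modeled loosely on the exemplary table above), an uploaded asset's position and capture time can be tested against each database record and tagged with the first matching event:

    # Hedged sketch: reference a device's location and capture time against
    # event records and return the matching event. Records are placeholders.
    import math
    from datetime import datetime

    EVENTS = [
        {"name": "Concert 1", "center": (48.8415, 2.2530), "radius_m": 200.0,
         "start": datetime(2014, 6, 18, 12, 0), "end": datetime(2014, 6, 18, 14, 0)},
        {"name": "Concert 2", "center": (48.8415, 2.2530), "radius_m": 200.0,
         "start": datetime(2014, 6, 18, 14, 0), "end": datetime(2014, 6, 18, 15, 30)},
    ]

    def meters_between(p, q):
        """Flat-earth distance approximation, adequate at venue scale."""
        dy = (p[0] - q[0]) * 111_320.0
        dx = (p[1] - q[1]) * 111_320.0 * math.cos(math.radians(q[0]))
        return math.hypot(dx, dy)

    def identify_event(position, capture_time, events=EVENTS):
        """Return the name of the event whose area and time window cover the upload."""
        for e in events:
            if (meters_between(position, e["center"]) <= e["radius_m"]
                    and e["start"] <= capture_time <= e["end"]):
                return e["name"]
        return None   # no match; the asset is left ungrouped

    print(identify_event((48.8416, 2.2531), datetime(2014, 6, 18, 13, 5)))   # Concert 1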
As noted before, users may capture audio and video contents at a program event using their cellphones, such as the cellphone 350-1 shown in FIG. 3. Audio and video contents can be captured through the integrated cameras 390 and microphones 385. Processor 395 may compress, alter, or modify the audio and video, and then store the media in memory 380 or stream the video to website or server 300 via a network 398 for further processing by server 300 according to the principles of the present invention. For each video or audio recorded and received by server 300, for example, the following information may be retained: location, time, device information (e.g., manufacturer or cellular service provider, etc.), and other miscellaneous data tags that a user may want to make about an event. In one embodiment of the present invention, a user of the mobile device may provide annotations during the recording of the event by entering them using text messages or Twitter feeds which are added to his or her video stream, where the texts and/or tweets will pop up at a certain time in the video (e.g., when rendered).
FIG. 1 shows an exemplary embodiment of a user interface screen according to the principles of the present invention. The display screen shown in FIG. 1 may be provided by an App on any of the mobile devices in FIG. 3, and/or by website/server 300. According to one embodiment, an annotation window 140 is provided to display the annotations entered by a user as described above. This annotation information 140 is referenced and correlated with, e.g., time information and/or video contents. Such information is updated through the video capture process. Alternatively, textual annotations can be added to a video stream before it is uploaded to the described website or server 300.
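A minimal sketch of how such time-referenced annotations could be handled (the annotation texts, timestamps, and window size are invented for illustration): each annotation is stored with its offset on the event timeline, and the annotation window 140 shows those entered at or before the current playback position:

    # Hedged sketch: store annotations with a timeline offset and return the
    # ones that should be visible in the annotation window at playback time t.
    annotations = [
        (300.0, "AN1: The band just walked on stage!"),
        (1500.0, "AN2: Amazing light show right now"),
        (2400.0, "AN3: Second encore starting"),
    ]

    def annotations_to_show(annotations, playback_time, max_items=3):
        """Most recent annotations entered up to `playback_time` seconds."""
        visible = [text for t, text in annotations if t <= playback_time]
        return visible[-max_items:]

    print(annotations_to_show(annotations, 1600.0))
    # ['AN1: The band just walked on stage!', 'AN2: Amazing light show right now']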
Location may be captured from the mobile devices in a variety of ways. While functions provided by GPS circuitries 365 are often used to determine the position of a mobile device, the location information does not necessarily need to be derived from GPS information. The device position could alternatively be determined through a variety of other technologies, including but not limited to triangulation based on cellular signal strength or wireless signal strength. An optional embodiment of the positional acquisition phase allows a user to tag the area in which they reside. For example, a user inputs the seats that they occupy at a location. In another embodiment, a graphic of a location is rendered, which can be provided in response to GPS information or textual input, where a user indicates where they are sitting or located relative to the map (boxes, terrace, promenade, etc.). This alternative locational metadata information is added to the video and/or audio media asset. Another component of the location information that may be utilized is the positional information for the point of view of a camera. Such information could be read from, e.g., a compass or gyroscope 375 that may be present in a mobile device such as the mobile phone 350-1 of FIG. 3.
For time information, the internal clock of a mobile device may not be sufficient to represent an accurate time. Multiple techniques, including reading the clock signal from a cell phone provider, using GPS clock information, or accessing time information from an atomic clock service (http://www.time.gov/), can be used for accurate time information if the internal clock information is insufficiently accurate. It is important for the time information between devices to be as consistent as possible. It is desirable that the different videos of the same event be coordinated against the same timeline so that the different videos of the same event can be accessed when being rendered. As an example, video window 110 of FIG. 1 may show a video taken from a first viewpoint by a first user, and video window 115 may show a second video taken from a second viewpoint by a second user of the same event.
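As a minimal sketch of keeping device clocks consistent (assuming each device can sample a reference clock such as GPS time, the carrier's clock signal, or a network time service), a per-device offset could be applied to all capture timestamps before upload; the function names are illustrative:

import time

def clock_offset_s(reference_epoch_s: float) -> float:
    # Offset to add to this device's clock so it approximates the reference clock.
    return reference_epoch_s - time.time()

def corrected_epoch_s(local_epoch_s: float, offset_s: float) -> float:
    # Devices report corrected timestamps so their clips can share one event timeline.
    return local_epoch_s + offset_s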
Audio information can also be captured from the different mobile devices of FIG. 3, where the "best quality" audio can be used as the reference. Such audio can be selected based on its fidelity, which can be determined in accordance with known methods. In an optional embodiment, a separate audio track of an event is captured from the mixing board of the event, where such audio is referenced against the same timeline. In another embodiment, the user can select which audio to play, for example, audio A1, audio A2, or audio A3, via selection icons 131, 132, and 133 respectively, instead of primary audio 130, as shown in FIG. 1.
In a further optional embodiment, audio tracks can be used as a form of commentary on the same event. A user can use the microphone 385 located in their mobile device and talk about an event while it is taking place. This information would exist as a separate audio track which could be mixed onto a base audio track, which is then referenced against the same timeline. For example, the audio commentary can be processed so that any audio not representing a human voice is filtered out using audio processing techniques.
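One crude way to keep roughly voice-band energy in a commentary track is a band-limiting filter; the sketch below (assuming mono floating-point samples) zeroes spectral content outside an assumed 300-3400 Hz band and is only an illustration, not the specific audio processing technique contemplated by the disclosure:

import numpy as np

def keep_voice_band(samples: np.ndarray, sample_rate: int,
                    low_hz: float = 300.0, high_hz: float = 3400.0) -> np.ndarray:
    # Forward FFT, zero out bins outside the assumed voice band, then invert.
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=samples.size)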
In an optional embodiment, the device information, such as manufacturer's brand, cellphone provider, etc., is used to determine whether or not the content from a certain device gets preference over other content. This can occur if, e.g., a device manufacturer sponsors an event at a particular venue. For example, Samsung may want to promote content generated from their devices if Samsung sponsored an event. This is illustrated in FIG. 1, where a video 105 is shown more prominently in a larger video display window 105, since the content is from a Samsung device. Any of the metadata derived from an event may be further processed by server 300 to provide additional features according to the principles of the present invention. Video and/or audio information with the respective location, time, user device, annotation, and miscellaneous information, such as an image of the user or indications of mood, may be retained and grouped together to represent an event. As multiple devices upload unique viewpoints of various events, the location (within a range) and time information (e.g., start, stop and/or duration) are used to group the assets together into specific events (E1, E2, ..., En). All assets for a single event may then be synchronized and referenced against a single timeline using the time metadata associated with those assets.
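A simple grouping of uploaded assets into events (E1, E2, ..., En) by GPS range and overlapping time could be sketched as follows; the bounding-box representation of a "GPS range" and the dictionary field names are assumptions made for this example:

from collections import defaultdict

def in_gps_range(lat, lon, gps_range):
    # gps_range: (lat_min, lat_max, lon_min, lon_max) bounding the venue.
    lat_min, lat_max, lon_min, lon_max = gps_range
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def group_assets(assets, events):
    # assets/events are dicts with 'start'/'end' times (epoch seconds) and location fields.
    groups = defaultdict(list)
    for asset in assets:
        for event in events:
            overlaps = asset["start"] <= event["end"] and asset["end"] >= event["start"]
            if overlaps and in_gps_range(asset["lat"], asset["lon"], event["gps_range"]):
                groups[event["name"]].append(asset)   # e.g. "E1" -> [V1, V4, V5, A1, ...]
                break
    return groups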
One example of groupings of different assets could be:
E1 - V1, V4, V5 / A1, A3, A5 / AN1, AN2, AN3 / GPS Range 1 / Time 1
E2 - V2, V3 / A2, A4, A6 / AN4, AN5 / GPS Range 2 / Time 2
E3 - V6, V7 / A7, A8, A9 / AN6, AN7, AN8 / GPS Range 3 / Time 3
Where:
E1 (Event 1) is associated with a GPS Range of Values {R1...} for a Time Duration 1.
E2 (Event 2) is associated with a GPS Range of Values {R2...} for a Time Duration 2.
E3 (Event 3) is associated with a GPS Range of Values {R3...} for a Time Duration 3.
A1, A4, and A9 (which are bolded) represent baseline audio presentations that could originate from an audio mixing board of an event. Other audio tracks (A2, A3, A5, A6, A7, A8) can be superimposed on A1, A4, or A9, in accordance with the principles described above.

FIG. 2 represents an exemplary embodiment of the present invention where multiple video, audio, and annotation contents may be presented and correlated, all referenced against a common timeline. For example, a first event Event 1 may comprise a first video V1 200 and a first audio track A1 215 from a mobile phone capturing the entire event on stage, with no textual annotation at the start of the event. A second video V4 205 from another mobile device, starting halfway through the event and lasting for a short period of time, is then positioned relative to the first video V1 200. A third video V5 210 from yet another device, starting at about the same time as the second video V4 205 but lasting significantly longer, is then able to be placed appropriately on the timeline as well. In this fashion, e.g., the first video V1 200 may be displayed on, e.g., sub screen 105 of FIG. 1, the second video V4 205 may be displayed on, e.g., sub screen 110 of FIG. 1, and the third video V5 210 could be displayed on sub screen 115, when the appropriate time had been reached while Event 1 was being viewed.
Similarly, a first, main audio A1 215 could also be uploaded and positioned appropriately on the timeline, and a second audio A3 220 and third audio A5 225 could be placed on the timeline in the same fashion as the first, second, and third videos. The various audio tracks could, in one embodiment, then be available for selection on the user interface shown in FIG. 1 and explained before, when the time associated with those audio clips was reached. Also, any annotations could be placed on the timelines at appropriate times, such as the time at which the annotations were made. In this example, a first annotation AN1 230 could be displayed in the annotation window 140 of the user interface of FIG. 1 before the second and third videos are available for display. A second annotation AN2 235 could be displayed during a time when all three video assets are available for viewing, and a third annotation AN3 240 could be displayed while the first and third videos were available for viewing, etc.
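Referencing all assets of one event against a single timeline can be reduced to storing, for each video, audio, or annotation asset, an offset from the event start; a query then returns whatever is available at a given playback position, roughly as sketched below (field names are illustrative):

def place_on_timeline(assets, event_start_s):
    # Annotate each asset with its position and length on the shared event timeline.
    for asset in assets:
        asset["offset_s"] = asset["start"] - event_start_s
        asset["length_s"] = asset["end"] - asset["start"]
    return assets

def available_at(assets, playback_s):
    # Videos, audio tracks, and annotations active at playback position playback_s.
    return [a for a in assets
            if a["offset_s"] <= playback_s < a["offset_s"] + a["length_s"]]

In the FIG. 2 example, the second and third videos would only appear in the result once playback reaches roughly the midpoint of Event 1.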
Media assets, after being grouped together, can be listed as different events, where such events can be accessed from a web site or a media asset service such as M-Go. For example, events could be listed in the following manner:
Event 1 - Hollywood Bowl / September 15, 2013 - Earth Wind and Fire
Event 2 - Hollywood Bowl / September 16, 2013 - Hollywood Bowl Orchestra
Event 3 - Hollywood Bowl / September 17, 2013 - Daft Punk and Friends
Based on the selection of an event, different options are presented to a user, which are shown as a sample user interface in FIG. 1 as described previously and further explained below.
A representation of the venue of the event is shown where the various viewpoints of the venue (V1 - Garden Boxes, V4 - Promenade, V5 - Ramp Seats) may be selected if such video assets are available (refer to FIG. 2 for the timing of video events). The availability of different viewpoints is represented by, e.g., person symbols 151-153, representing the presence and the specific locations of different users within the event venue. Note that if a video is only available for part of an event presentation, the person symbol will only appear coincident in time with the availability of the video representing that viewpoint.
As described before, annotation window 140 displays various textual and/or other annotations that different users made during the event. In addition, there is also a content navigation section comprising a timeline 141, various user controls 142 (e.g., play, stop, fast forward, and reverse), and a slider 143 for navigating the contents, to, for example, play, stop, fast forward, or reverse the contents.
The different video contents of an event may be shown in video viewing section 100 of FIG. 1 as described before. As seen in FIG. 1, different audio tracks A1 131, A3 132, and A5 133 are also available for selection. Note that in one embodiment, audio track A1 131 may be used as the primary audio track while secondary audio tracks A3 132 and A5 133 (e.g., additional audio commentaries) may be mixed in together with the primary audio as described previously. One skilled in the art can readily appreciate and recognize that different embodiments of playback windows other than the exemplary ones shown in FIG. 1 may be constructed and rendered in accordance with the principles of the present invention.

One optional embodiment of the disclosed principles provides an audio "sync" watermark that is placed within the audio played at a venue (e.g., the music of a concert/fashion show) and the like. This sync information can be used to synchronize the videos of people located at different points of a venue so the "background" audio of a recording can match up. For example, a person 153 recording a video at the front row of seats at a venue or event location 150 and a person 151 recording video at the back of the venue or event location 150 will have disparities in the background audio if a common timestamp format is used. The audio syncs can be used to adjust the timing between both people so that if the video at a website is switched between the front row and the back row, potential problems with audio sync are minimized. In streaming audio, this kind of problem is known as audio drift.
In an optional embodiment, a user can be presented with viewpoint options of their friends to whom they are linked on a social network such as Facebook or M-Go Social. For example, a user who generated V1 at location 153 is linked, via a social networking site, to the user who generated V4 at location 152, but not to the user who generated V5 at location 151. Through such a connection, the user associated with V1 could only see content generated from the friend (V4) but not from the stranger (V5). As described before, in another optional embodiment, the device information referenced above is used to determine whether or not the content from a certain device gets preference over other content. This can occur if, e.g., a device manufacturer sponsors an event at a particular venue. For example, for Event 1, Samsung may want to promote content generated from their devices if Samsung sponsored an event. This content may be displayed more prominently, as shown in the larger sub window 105 of FIG. 1. Content generated from other devices made by other manufacturers may not be shown, or may be shown in smaller, less prominent windows 110 and 115 as illustrated in FIG. 1.
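As a rough, non-limiting illustration of the audio-based alignment discussed above (whether via an embedded "sync" watermark or the recorded background audio itself), the offset between two recordings of the same event can be estimated by cross-correlating short excerpts; the function below assumes mono sample arrays at a common sample rate:

import numpy as np

def estimate_offset_s(ref: np.ndarray, other: np.ndarray, sample_rate: int) -> float:
    # Positive result means `other` lags `ref` by that many seconds.
    corr = np.correlate(other, ref, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(ref) - 1)
    return lag_samples / sample_rate

Shifting one clip by the estimated offset before switching viewpoints would reduce the audible drift described above.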
In an optional embodiment, a filtering mode can be offered where video effects affect the playback or recording of videos based on the direction in which a user moves their mobile/recording device. For example, a different filter can be applied to the color of a video recording if the mobile device is moved along, e.g., the X-Y axis (e.g., lighter to darker; more to less saturation, etc.), while a fuzzy/sharpen filter can be applied in a second direction, e.g., along the Y-Z axis. This lets a person adjust the video filters on the fly. This is something that neither Vine nor Instagram currently do. Here, a user could specify different filters in accordance with different directions (black and white/color transition, noise, film grain, etc.).
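The filtering mode could, for example, map normalized device displacement along different axes to filter parameters; the mapping below is purely illustrative (axis assignments, ranges, and filter names are assumptions, not the specific filters contemplated):

def filters_from_motion(dx: float, dy: float, dz: float) -> dict:
    # dx, dy, dz: normalized displacement of the device since recording began, in [-1, 1].
    def clamp(v, lo, hi):
        return max(lo, min(hi, v))
    return {
        "brightness": clamp(dx, -1.0, 1.0),        # X movement: lighter to darker
        "saturation": clamp(1.0 + dy, 0.0, 2.0),   # Y movement: more to less saturation
        "sharpen": clamp(dz, -1.0, 1.0),           # Z movement: blur (<0) to sharpen (>0)
    }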
FIG. 4 shows an exemplary embodiment of a process according to the principles of the present invention. In step 400 of FIG. 4, a first plurality of data, comprising, e.g., the location, the time, video content, and audio content and/or various metadata, is sent from a first device to a server 300 and received by the server 300. In one embodiment, the video and audio contents are pre-recorded prior to submission. Another embodiment allows the contents to be streamed live.
In step 405 of FIG. 4, a second plurality of data is provided. The second plurality of data may include a graphical representation of the area of the event location, the time duration of the program event, and/or a description of the program event. The graphical representation of the event may be processed and shown as illustrated in the event location window 150 of FIG. 1, in which a representative image of the location is shown. Further information may be provided including, e.g., GPS coordinates and the number of users and content sources 151-153 at the event location. For example, a representation of a football stadium may simply be an overhead image of the stadium, or it may be a graphical representation of the seating chart along with GPS coordinates circumscribing the stadium.
In step 410, a third plurality of data, comprising, e.g., the location, the time, video content, and audio content and/or various metadata, is sent from a second device, located at some viewing point within the area of the event location, to the server 300 and received by the server 300.
In step 415, a program event being common between the first and second video contents is identified based upon, e.g., the time and location information included in the first, second, and third pluralities of data.
In step 425, these common media assets are examined and arranged in such a way that the video and audio from each of the sources are synchronized. This could allow, for example, a user to start playing the video and audio from the first device, then switch to playing the video from the first device while listening to the audio from the second device, without the video and audio falling out of synchronization, as described before.
In step 430, a graphical representation of the event location is provided to a user, along with identified viewing points on or near that representation, allowing the user to select viewing points on the user interface in order to select a video asset to view.
In step 435, a sponsor of an event could highlight or emphasize viewing points that match a particular device manufacturer. Alternatively, this could be from individuals linked to certain social media groups, or individuals who use a particular hashtag to annotate the video. In step 440, audio content with higher quality is utilized regardless of video content selection.
FIG. 5 shows another exemplary process according to the principles of the present invention. In step 500, a plurality of contents is received from a plurality of devices at a location having an event. In step 505, a program event is automatically determined based on, e.g., location information, timing information, video, audio, and/or annotation information from the plurality of devices.
In step 510, a graphical representation of the event location is provided on a user interface. Superimposed on this graphical representation are the locations of each available viewpoint or of the users providing their respective content at any point in time of an event. As an example, visible in FIG. 1, in the upper left box of the user interface, is the superimposition of graphical representations representing different viewpoints V1 151, V4 152, and V5 153 at an event location 150. This graphical representation could be displayed in a variety of ways, including on a user interface of a device at the event, as noted in step 515, or accessed on a device after the event.
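Superimposing the viewpoint markers 151-153 on a venue graphic can be done by mapping each device's GPS coordinates into pixel coordinates of the graphic; the linear mapping below is a sketch that assumes the graphic is bounded by known minimum and maximum latitudes and longitudes:

def gps_to_pixel(lat, lon, bounds, image_w, image_h):
    # bounds: (lat_min, lat_max, lon_min, lon_max) circumscribing the venue graphic.
    lat_min, lat_max, lon_min, lon_max = bounds
    x = (lon - lon_min) / (lon_max - lon_min) * image_w
    y = (lat_max - lat) / (lat_max - lat_min) * image_h   # image y grows downward
    return int(x), int(y)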
In step 520, a user may decide to select an image to represent their viewpoint at any time. As an example, instead of simply displaying a generic person symbol V1, the system might display a close-up image of the actual user's face from his or her selfie picture. Additionally, step 525 allows that a user may modify their graphical representation at any time. For example, a user could identify their mood via a representative emoticon or image as it changes during a football game: excited in the beginning, happy when they score, and upset when they eventually lose the game. In another embodiment, the icon may represent the team logo of the opposing teams to identify the allegiance of the user. These graphical representations could be displayed on a device during the event, as seen in step 530, or at any other time that is desirable or useful.

While several embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present embodiments. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings herein is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereof, the embodiments disclosed may be practiced otherwise than as specifically described and claimed. The present embodiments are directed to each individual feature, system, article, material and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials and/or methods, if such features, systems, articles, materials and/or methods are not mutually inconsistent, is included within the scope of the present
embodiments.

Claims

1. A method comprising:
receiving (400) a first plurality of data from a first device, said first device located at a first viewing point within an area of location, said first plurality of data including first data, second data, third data, and fourth data, said first data indicating a location of said first viewing point, said second data indicating a first time, said third data including a first video content exhibiting a program event captured from said first viewing point at said first time, said fourth data including a first audio content captured at said first viewing point at said first time, said first audio content associated with said first video content;
providing (405) a second plurality of data, said second plurality of data including fifth data, sixth data, and seventh data, said fifth data indicating a graphical representation of said area of location, said sixth data indicating a time duration of said program event, said seventh data including a description of said program event;
receiving (410) a third plurality of data from a second device, said second device located at a second viewing point within said area of location, said third plurality of data including eighth data, ninth data, tenth data, and eleventh data, said eighth data indicating a location of said second viewing point, said ninth data indicating a second time, said tenth data including a second video content exhibiting said program event captured from said second viewing point at said second time, said eleventh data including a second audio content captured at said second viewing point at said second time, said second audio content associated with said second video content;
identifying (415) said program event being common between said first and second video contents based upon said time and location information included in said first, second, and third pluralities of data;
forming (420) a group of contents for said program event, said group of contents comprising said first video and first audio contents and said second video and second audio contents; processing (425) said group of contents such that a combination of said first video and said second audio contents is played in a synchronized manner; and providing (430) a graphical representation of said area of location, said graphical representation including user-selectable points representing respective ones of said first and second viewing points simultaneously on a user interface screen.
2. The method of claim 1, further comprising:
identifying (105) a manufacturer of said first device based upon said first plurality of data received from said first device, said first plurality of data including a twelfth data, said twelfth data indicating said manufacturer of said first device;
identifying a sponsor of said event based upon said second plurality of data, said second plurality of data including a thirteenth data, said thirteenth data indicating a sponsor of said program event; and emphasizing (435, 105) said first viewing point on said user interface screen in response to a matching of said manufacturer and said sponsor.
3. The method of claim 1, wherein:
a quality of said first audio content is better than a quality of said second audio content, said first audio content is selected (440) as a default audio content, regardless of a selection of one of said first and second video contents by a user.
4. The method of claim 1, wherein said first data is determined by GPS data received at said first device.
5. The method of claim 1, wherein said second data is adjusted by a reference data received at said first device.
6. An apparatus comprising: means (310) for receiving a first plurality of data from a first device and a third plurality of data from a second device, said first device located at a first viewing point within an area of location, said first plurality of data including first data, second data, third data, and fourth data, said first data indicating a location of said first viewing point, said second data indicating a first time, said third data including a first video content exhibiting a program event captured from said first viewing point at said first time, said fourth data including a first audio content captured at said first viewing point at said first time, said first audio content associated with said first video content; said second device located at a second viewing point within said area of location, said third plurality of data including eighth data, ninth data, tenth data, and eleventh data, said eighth data indicating a location of said second viewing point, said ninth data indicating a second time, said tenth data including a second video content exhibiting said program event captured from said second viewing point at said second time, said eleventh data including a second audio content captured at said second viewing point at said second time, said second audio content associated with said second video content;
means for providing (320) a second plurality of data, said second plurality of data including fifth data, sixth data, and seventh data, said fifth data indicating a graphical representation of said area of location, said sixth data indicating a time duration of said program event, said seventh data including a description of said program event;
means for identifying (320, 330) said program event being common between said first and second video contents based upon said time and location information included in said first, second, and third pluralities of data;
means for forming (320, 340) a group of contents for said program event, said program event being common between said first and second video contents based upon said time and location information included in said first, second, and third pluralities of data, said group of contents comprising said first video and first audio contents and said second video and second audio contents; and
means for providing (320, 330, 340) said group of contents in such a way that a combination of said first video and said second audio contents is played in a synchronized manner and a graphical representation of said area of location, said graphical representation including user-selectable points representing respective ones of said first and second viewing points simultaneously on a user interface screen.
7. The apparatus of claim 6, wherein the means for identifying further identifies a manufacturer of said first device based upon said first plurality of data received from said first device, said first plurality of data including a twelfth data, said twelfth data indicating said manufacturer of said first device, and a sponsor of said event based upon said second plurality of data, said second plurality of data including a thirteenth data, said thirteenth data indicating a sponsor of said program event.
8. The apparatus of claim 6, wherein a quality of said first audio content is better than a quality of said second audio content, said first audio content is selected as a default audio content, regardless of a selection of one of said first and second video contents by a user.
9. The apparatus of claim 6, wherein said first data is determined by GPS data received at said first device.
10. The apparatus of claim 6, wherein said second data is adjusted by a reference data received at said first device.
11. An apparatus comprising:
a communication interface (310) for receiving a first plurality of data from a first device and a third plurality of data from a second device, said first device located at a first viewing point within an area of location, said first plurality of data including first data, second data, third data, and fourth data, said first data indicating a location of said first viewing point, said second data indicating a first time, said third data including a first video content exhibiting a program event captured from said first viewing point at said first time, said fourth data including a first audio content captured at said first viewing point at said first time, said first audio content associated with said first video content; said second device located at a second viewing point within said area of location, said third plurality of data including eighth data, ninth data, tenth data, and eleventh data, said eighth data indicating a location of said second viewing point, said ninth data indicating a second time, said tenth data including a second video content exhibiting said program event captured from said second viewing point at said second time, said eleventh data including a second audio content captured at said second viewing point at said second time, said second audio content associated with said second video content;
a processor (320) for providing a second plurality of data, said second plurality of data including fifth data, sixth data, and seventh data, said fifth data indicating a graphical representation of said area of location, said sixth data indicating a time duration of said program event, said seventh data including a description of said program event;
said processor (320) identifies said program event being common between said first and second video contents based upon said time and location information included in said first, second, and third pluralities of data, and forms a group of contents for said program event, said program event being common between said first and second video contents based upon said time and location information included in said first, second, and third pluralities of data, said group of contents comprising said first video and first audio contents and said second video and second audio contents; and said processor (320) provides said group of contents in such a way that a combination of said first video and said second audio contents is played in a synchronized manner and a graphical representation of said area of location, said graphical representation including user-selectable points representing respective ones of said first and second viewing points simultaneously on a user interface screen.
12. The apparatus of claim 11, wherein the processor further identifies a manufacturer of said first device based upon said first plurality of data received from said first device, said first plurality of data including a twelfth data, said twelfth data indicating said manufacturer of said first device, and a sponsor of said event based upon said second plurality of data, said second plurality of data including a thirteenth data, said thirteenth data indicating a sponsor of said program event.
13. The apparatus of claim 11, wherein a quality of said first audio content is better than a quality of said second audio content, said first audio content is selected as a default audio content, regardless of a selection of one of said first and second video contents by a user.
14. The apparatus of claim 11, wherein said first data is determined by GPS data received at said first device.
15. The apparatus of claim 11, wherein said second data is adjusted by a reference data received at said first device.
16. A method comprising:
receiving (500) a plurality of contents from a plurality of devices at a location having an event;
determining (505) automatically the event based on location information and timing information from said plurality of devices; and
providing (510) a graphical representation of said location having said event, said graphical representation including, superimposed, said location information from said plurality of devices.
17. The method of claim 16 wherein said location information is based on GPS information.
18. The method of claim 17 wherein the event is automatically determined by common location information and timing information from said plurality of devices.
19. The method of claim 16 further comprising the step of providing said plurality of contents in real time.
20. The method of claim 19 further comprising the step of providing a selected one of said plurality of contents based on a manufacturer brand of a corresponding device.
21. The method of claim 16 further comprising the step of displaying said graphical representation on at least one of said devices at said location having said event.
22. The method of claim 16 further comprising the step of displaying said graphical representation of a person associated with at least one of said devices at said location having said program event.
23. The method of claim 22 where said graphical representation of a person is selected in response to previous user action.
24. The method of claim 22 where said graphical representation of a person may be modified in real time in response to a user action.
25. An apparatus comprising:
a processor configured to receive (500) a plurality of contents from a plurality of devices at a location having an event, a processor configured to determine (505) automatically the event based on location information and timing information from said plurality of devices, and a processor configured to provide (510) a graphical representation of said location having said event, said graphical representation including, superimposed, said location information from said plurality of devices.
26. The apparatus of claim 25 wherein said location information is based on GPS information.
27. The apparatus of claim 26 wherein the event is automatically determined by common location information and timing information from said plurality of devices.
28. The apparatus of claim 25 further comprising a processor configured to provide said plurality of contents in real time.
29. The apparatus of claim 28 further comprising a processor configured to provide a selected one of said plurality of contents based on a manufacturer brand of a corresponding device.
30. The apparatus of claim 25 further comprising a processor configured to display said graphical representation on at least one of said devices at said location having said event.
31. The apparatus of claim 25 further comprising a processor configured to display said graphical representation of a person associated with at least one of said devices at said location having said program event.
32. The apparatus of claim 31 where said graphical representation of a person is selected in response to previous user action.
33. The apparatus of claim 31 where said graphical representation of a person may be modified in real time in response to a user action.
PCT/US2015/034661 2014-06-18 2015-06-08 Multiple viewpoints of an event generated from mobile devices WO2015195390A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462013622P 2014-06-18 2014-06-18
US62/013,622 2014-06-18

Publications (1)

Publication Number Publication Date
WO2015195390A1 true WO2015195390A1 (en) 2015-12-23

Family

ID=53487427

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/034661 WO2015195390A1 (en) 2014-06-18 2015-06-08 Multiple viewpoints of an event generated from mobile devices

Country Status (1)

Country Link
WO (1) WO2015195390A1 (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090087161A1 (en) * 2007-09-28 2009-04-02 Gracenote, Inc. Synthesizing a presentation of a multimedia event
US20090148124A1 (en) * 2007-09-28 2009-06-11 Yahoo!, Inc. Distributed Automatic Recording of Live Event
WO2010068175A2 (en) * 2008-12-10 2010-06-17 Muvee Technologies Pte Ltd Creating a new video production by intercutting between multiple video clips
WO2010088515A1 (en) * 2009-01-30 2010-08-05 Priya Narasimhan Systems and methods for providing interactive video services
US20110029894A1 (en) * 2009-02-20 2011-02-03 Ira Eckstein System and method for communicating among users of a set group
US20120213404A1 (en) * 2011-02-18 2012-08-23 Google Inc. Automatic event recognition and cross-user photo clustering
US20130132836A1 (en) * 2011-11-21 2013-05-23 Verizon Patent And Licensing Inc. Methods and Systems for Presenting Media Content Generated by Attendees of a Live Event
US20140150042A1 (en) * 2012-11-29 2014-05-29 Kangaroo Media, Inc. Mobile device with location-based content

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018146442A1 (en) * 2017-02-07 2018-08-16 Tagmix Limited Event source content and remote content synchronization
CN110800307A (en) * 2017-02-07 2020-02-14 塔杰米克斯有限公司 Event source content and remote content synchronization
US11094349B2 (en) 2017-02-07 2021-08-17 Tagmix Limited Event source content and remote content synchronization
US11785276B2 (en) 2017-02-07 2023-10-10 Tagmix Limited Event source content and remote content synchronization
CN108966027A (en) * 2018-08-15 2018-12-07 郑州云海信息技术有限公司 A kind of audio video synchronization back method and device

Similar Documents

Publication Publication Date Title
US11582182B2 (en) Multi-user media presentation system
US11100953B2 (en) Automatic selection of audio and video segments to generate an audio and video clip
CN107852399B (en) Streaming media presentation system
US9009596B2 (en) Methods and systems for presenting media content generated by attendees of a live event
US9344606B2 (en) System and method for compiling and playing a multi-channel video
US8913171B2 (en) Methods and systems for dynamically presenting enhanced content during a presentation of a media content instance
US10887673B2 (en) Method and system for associating recorded videos with highlight and event tags to facilitate replay services
US20130259447A1 (en) Method and apparatus for user directed video editing
US20160180883A1 (en) Method and system for capturing, synchronizing, and editing video from a plurality of cameras in three-dimensional space
CN103842936A (en) Recording, editing and combining multiple live video clips and still photographs into a finished composition
US8943020B2 (en) Techniques for intelligent media show across multiple devices
WO2015195390A1 (en) Multiple viewpoints of an event generated from mobile devices
JP2020523686A (en) System and method for operating a streaming service that provides a community space for media content items

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15731440

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15731440

Country of ref document: EP

Kind code of ref document: A1