WO2011082083A2 - Flick intel annotation methods and systems - Google Patents

Flick intel annotation methods and systems

Info

Publication number
WO2011082083A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
person
coordinate
frame
video content
Prior art date
Application number
PCT/US2010/061888
Other languages
French (fr)
Other versions
WO2011082083A3 (en)
Inventor
Richard Krukar
Luis Ortiz
Kermit Lopez
Original Assignee
Flick Intel, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Flick Intel, Llc filed Critical Flick Intel, Llc
Publication of WO2011082083A2 publication Critical patent/WO2011082083A2/en
Publication of WO2011082083A3 publication Critical patent/WO2011082083A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/748 Hypervideo

Abstract

Video content is a time varying presentation of scenes or video frames. Each frame can contain a number of scene elements such as actors, foreground items, background items, or other items. A person enjoying video content can select a scene element by specifying a screen coordinate while the video content plays. Frame specification data identifies the specific frame or scene being displayed when the coordinate is selected. The coordinate in combination with the frame specification data is sufficient to identify the scene element that the person has chosen. Information about the scene element can then be presented to the person. An annotation database can relate the scene elements to the frame specification data and coordinates.

Description

FLICK INTEL ANNOTATION METHODS AND SYSTEMS
TECHNICAL FIELD
[0001] Embodiments relate to video content, video displays, and video compositing. Embodiments also relate to computer systems, user input devices, databases, and computer networks.
BACKGROUND OF THE INVENTION
[0002] People have watched video content on televisions and other audio-visual devices for decades. They have also used gaming systems, personal computers, handheld devices, and other devices to enjoy interactive content. They often have questions about places, people, and things appearing as the video content is displayed, and about the music they hear. Databases containing information about the content, such as the actors in a scene or the music being played, already exist and provide users with the ability to learn more.
[0003] The existing database solutions provide information about elements appearing in a movie or scene, but only in a very general way. A person curious about a scene element can obtain information about the scene and hope that the information mentions the scene element in which the person is interested. Systems and methods that provide people with the ability to select a specific scene element and to obtain information about only that element are needed.
BRIEF SUMMARY
[0004] The following summary is provided to facilitate an understanding of some of the innovative features unique to the embodiments and is not intended to be a full description. A full appreciation of the various aspects of the embodiments can be gained by taking the entire specification, claims, drawings, and abstract as a whole.
[0005] It is therefore an aspect of the embodiments that a media device can provide video content to a display device and that a person can view the video content as it is presented on the display device. A series of scenes or a time varying series of frames along with any audio dialog, music, or sound effects are examples of video content.
[0006] It is another aspect of the embodiments that the person can choose a coordinate on the display device. A coordinate can be chosen with a pointing device or any other form of user input by which the person can indicate a spot on the display device and select that spot. Frame specification data can be generated when the person chooses the coordinate. The frame specification data can identify a specific scene or frame within the video content.
[0007] It is yet another aspect of the embodiments to provide an element identifier based on the coordinate and the frame specification data. Element identifiers are uniquely associated with scene elements. The element identifier can be obtained by querying an annotation database that relates element identifiers to coordinates and frame specification data. The element identifier can also be provided by a human worker who views the scene or frame, looks to the coordinate, and reports what appears at that location.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The accompanying figures, in which like reference numerals refer to identical or functionally similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate aspects of the embodiments and, together with the background, brief summary, and detailed description serve to explain the principles of the embodiments.
[0009] Figure 1 illustrates element data being presented on a second display in response to the selection of a scene element on a first display in accordance with aspects of certain embodiments;
[0010] Figure 2 illustrates an annotation database providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments;
[0011] Figure 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments; and
[0012] Figure 4 illustrates an annotated content stream passing to a media device such that the media device produces element data in accordance with aspects of certain embodiments.
DETAILED DESCRIPTION
[0013] The particular values and configurations discussed in these non-limiting examples can be varied and are cited merely to illustrate at least one embodiment and are not intended to limit the scope thereof. In general, the figures are not to scale.
[0014] The embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which illustrative embodiments of the invention are shown. The embodiments disclosed herein can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
[0015] Video content is a time varying presentation of scenes or video frames. Each frame can contain a number of scene elements such as actors, foreground items, background items, or other items. A person enjoying video content can select a scene element by specifying a screen coordinate while the video content plays. Frame specification data identifies the specific frame or scene being displayed when the coordinate is selected. The coordinate in combination with the frame specification data is sufficient to identify the scene element that the person has chosen. Information about the scene element can then be presented to the person. An annotation database can associate scene elements with frame specification data and coordinates.
[0016] Figure 1 illustrates element data being presented on a second display 119 in response to the selection of a scene element on a display 101 in accordance with aspects of certain embodiments. A media device 104 passes video content to the display 101 to be viewed by a person. The person can manipulate a selection device 112 to choose a coordinate 102 on a display device 101. The coordinate can then be passed to a media device 104. In some embodiments the selection device can detect the coordinate 105. For example, the selection device 112 can detect the locations of emitters 106 and infer the screen position being pointed at from those emitter locations. In other embodiments the display 101 can detect the coordinate 103. For example, the selection device can emit a light beam that the display device detects. Other common coordinate selection means include mice, trackballs, and touch sensors. More advanced pointing means can observe the person's body or eyeballs to thereby determine a coordinate. Clicking a button or some other action can generate an event indicating that a scene element is chosen.
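For illustration only, the sketch below models such a selection event in Python. It assumes, since the disclosure does not specify a coordinate system, that the raw pixel position reported by the pointing device is normalized against the display resolution so the same coordinate works on any screen size; all names are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Coordinate:
        """Resolution-independent screen coordinate in [0, 1] x [0, 1]."""
        x: float
        y: float

    def normalize_click(pixel_x: int, pixel_y: int,
                        width: int, height: int) -> Coordinate:
        # Map a raw pixel position onto a unit square so annotations made
        # against one display resolution still match on another.
        return Coordinate(pixel_x / width, pixel_y / height)

    # Example: a click at pixel (960, 540) on a 1920x1080 display.
    coord = normalize_click(960, 540, 1920, 1080)  # Coordinate(x=0.5, y=0.5)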
[0017] The media device 104 can generate a selection packet 107 that includes frame selection data and the coordinate 102. The frame selection data is data that is sufficient to identify a specific frame or scene. For example, the frame selection data can be a media tag 108 and a timestamp 109. The media tag 108 can identify a particular movie, show, sporting event, advertisement, video clip, scene, or other unit of video content. A timestamp 109 specifies a time within the video content. In combination, a media tag and timestamp can specify a particular frame from amongst all the frames of video content that have ever been produced.
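Continuing the sketch, the selection packet 107 can be modeled as a small record pairing the frame selection data with the coordinate; the field names and tag format are illustrative assumptions, as the disclosure does not fix a wire format.

    @dataclass
    class SelectionPacket:
        media_tag: str     # identifies the movie, show, clip, etc. (tag 108)
        timestamp: float   # seconds into the video content (timestamp 109)
        coord: Coordinate  # the chosen screen coordinate (102)

    packet = SelectionPacket(media_tag="movie:tt0000001",
                             timestamp=1512.4,
                             coord=coord)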
[0018] The frame selection packet 107 can be formed into a query for an annotation database 111. The annotation database 111 can contain associations of element identifiers with frame selection data and coordinates. As such, the annotation database 111 can produce an element identifier 113 in response to the query. The element identifier 113 can identify a person 114, an item 115, music 116, a place 117, or something else.
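One plausible realization of such a query, continuing the sketch above, is shown against SQLite with an assumed schema (the disclosure prescribes none): each annotation row holds an element identifier, a media tag, a time range, and a bounding box, and the query returns the identifier whose row contains the selected time and coordinate.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE annotations (
        element_id TEXT, media_tag TEXT,
        t_start REAL, t_end REAL,   -- when the element is on screen
        x_min REAL, x_max REAL,     -- its bounding box, normalized coords
        y_min REAL, y_max REAL)""")
    db.execute("INSERT INTO annotations VALUES "
               "('car:1969-mustang', 'movie:tt0000001', "
               "1500.0, 1540.0, 0.4, 0.7, 0.3, 0.8)")

    def query_element(packet: SelectionPacket) -> str | None:
        row = db.execute(
            """SELECT element_id FROM annotations
               WHERE media_tag = ? AND t_start <= ? AND ? <= t_end
                 AND x_min <= ? AND ? <= x_max
                 AND y_min <= ? AND ? <= y_max""",
            (packet.media_tag, packet.timestamp, packet.timestamp,
             packet.coord.x, packet.coord.x,
             packet.coord.y, packet.coord.y)).fetchone()
        return row[0] if row else None

    print(query_element(packet))  # -> car:1969-mustang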
[0019] The element identifier 113 can then be passed to another server 118 that responds by producing element data for presentation to the person. Examples of element data include, but are not limited to: statistics on a person such as an athlete; a picture of a person, object, or place; an offer for purchase of an item, service, or song; and links to other media in which a person, item, or place appears.
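A minimal sketch of such an additional data server follows, with an assumed in-memory catalog standing in for whatever backing store a real deployment would use; the disclosure leaves the data source open.

    ELEMENT_CATALOG = {
        "car:1969-mustang": {                            # hypothetical entry
            "name": "1969 Ford Mustang",
            "picture": "https://example.com/mustang.jpg",
            "offer": "Classic Mustangs for sale near you",
            "also_appears_in": ["movie:tt0000002"],
        },
    }

    def element_data(element_id: str) -> dict:
        # Resolve an element identifier 113 into presentable element data.
        return ELEMENT_CATALOG.get(element_id, {"name": "Unknown element"})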
[0020] Figure 2 illustrates an annotation database 111 providing element identifiers 211 in response to a person selecting scene elements in accordance with aspects of the embodiments. An annotation service/module 202 can produce annotated content 203 by annotating content 201. An annotation module is a device, algorithm, program, or other means that automatically annotates content. Image recognition algorithms can locate items within scenes and frames and thereby automatically provide annotation data. An annotation service is a service provider that annotates content. An annotation service provider can employ both human workers and annotation modules.
[0021] Annotation is a process wherein scene elements, each having an element identifier, are associated with media tags and space time ranges. A space time range identifies a range of times and positions at which a scene element appears. For example, a car can sit unmoving during an entire scene. The element identifier can specify the make, model, color, and trim level of the car, the media tag can identify a movie containing the scene, and the space time range can specify the time range of the movie scene and the location of the car within the scene.
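As a concrete sketch of that association, a space time range can be represented as a time interval plus a screen region with a containment test; the class below continues the running example and is an illustrative assumption, not a structure mandated by the disclosure.

    @dataclass
    class SpaceTimeRange:
        t_start: float   # when the element appears
        t_end: float     # when it disappears
        x_min: float     # where it is on screen, in normalized coordinates
        x_max: float
        y_min: float
        y_max: float

        def contains(self, t: float, c: Coordinate) -> bool:
            return (self.t_start <= t <= self.t_end
                    and self.x_min <= c.x <= self.x_max
                    and self.y_min <= c.y <= self.y_max)

    # The parked car of the example: one box covering the whole scene.
    car_range = SpaceTimeRange(1500.0, 1540.0, 0.4, 0.7, 0.3, 0.8)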
[0022] The content 201 can be passed to a media device 104 that produces a media stream 207 for presentation on a display device 208. A person 205 watching the display device 206 can use a selection device 112 to choose a coordinate on the display device 206. A selection packet 107 containing the coordinate and some frame specification data can then be passed to the annotation database 111, which responds by identifying the scene element 211. An additional data server 118 can produce element data 212 for that identified scene element 211. The element data 212 can then be presented to the person.
[0023] Figure 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments. The embodiment of Fig. 3 differs from that of Fig. 2 in that the content 201 is not necessarily annotated before being viewed by the person 205. The selection packet 107 is passed to the annotation service 301, where a human worker 302 or annotation module 303 determines what scene element the person 205 selected and creates a new annotation entry for incorporation into the annotation database 111.
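A sketch of that on-demand flow appears below, assuming a simple queue for unresolved selections and reusing the lookup from the earlier example; Fig. 3 does not fix these mechanics, so both are assumptions.

    import queue

    pending: queue.Queue = queue.Queue()  # selections awaiting a human worker

    def identify(packet: SelectionPacket) -> str | None:
        element_id = query_element(packet)  # try the annotation database first
        if element_id is None:
            pending.put(packet)  # a worker 302 or module 303 labels it later
        return element_id

    def worker_labels(packet: SelectionPacket, element_id: str,
                      rng: SpaceTimeRange) -> None:
        # The worker's answer becomes a new entry in the annotation database.
        db.execute("INSERT INTO annotations VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                   (element_id, packet.media_tag, rng.t_start, rng.t_end,
                    rng.x_min, rng.x_max, rng.y_min, rng.y_max))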
[0024] Figure 4 illustrates an annotated content stream 401 passing to a media device 104 such that the media device 104 produces element data 407 in accordance with aspects of certain embodiments. Annotated content, such as annotated content 203 of Fig. 2, can be passed as an annotated content stream 401 to the media device 104. The annotated content stream 401 can include a content stream 402, element stream 403, and element data 406. The media device 104 can then pass the content for presentation on the display 208 and store the element data 406 and the data in the element stream 403. The data in the element stream 403 can be formed into an annotation database, with the possible exception that no media tag is needed. No media tag is needed because all the annotations refer only to the content stream 402. As such, the element stream 403 is illustrated as containing only space time ranges 404 and element identifiers 405.
[0025] The media device 104, having assembled an annotation database and having stored element data 406, can produce element data 407 for a scene element selected by a person 205 without querying remote databases or accessing remote resources.
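A sketch of that local lookup, continuing the running example: the media device keeps the element stream entries in memory keyed only by space time range, since every entry refers to the one content stream it is playing (representation assumed, as before).

    local_annotations: list[tuple[SpaceTimeRange, str]] = [
        (car_range, "car:1969-mustang"),  # from the element stream 403
    ]

    def local_element_data(t: float, c: Coordinate) -> dict | None:
        # No media tag and no remote query: everything refers to the
        # content stream 402 currently playing on this media device.
        for rng, element_id in local_annotations:
            if rng.contains(t, c):
                return element_data(element_id)  # element data 406, local
        return None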
[0026] Note that in practice, the content stream 402, element stream 403, and element data 406 can be transferred separately or in combination as streaming data. Means for transferring content, annotations, and element data include TV signals and storage devices such as DVD disks or data disks. Furthermore, the element data 406 can be passed to the media device 104 or can be stored and accessed on a remote server.
[0027] It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.
[0028] The particular values and configurations discussed in these non-limiting examples can be varied and are cited merely to illustrate at least one embodiment and are not intended to limit the scope thereof.
[0029] The embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which illustrative embodiments of the invention are shown. The embodiments disclosed herein can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
[0030] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0031] Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
[0032] As will be appreciated by one skilled in the art, the present invention can be embodied as a method, data processing system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects, all generally referred to herein as a "circuit" or "module." Furthermore, the present invention may take the form of a computer program product on a computer-usable storage medium having computer-usable program code embodied in the medium. Any suitable computer readable medium may be utilized including hard disks, USB Flash Drives, DVDs, CD-ROMs, optical storage devices, magnetic storage devices, etc.
[0033] Computer program code for carrying out operations of the present invention may be written in an object oriented programming language (e.g., Java, C++, etc.). The computer program code, however, for carrying out operations of the present invention may also be written in conventional procedural programming languages such as the "C" programming language, in a visually oriented programming environment such as, for example, VisualBasic, or in functional programming languages such as LISP or Erlang.
[0034] The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN), a wide area network (WAN), or a wireless data network (e.g., WiFi, WiMax, 802.xx, or a cellular network), or the connection may be made to an external computer via most third party supported networks (for example, through the Internet using an Internet Service Provider).
[0035] The invention is described in part below with reference to flowchart illustrations and/or block diagrams of methods, systems, computer program products, and data structures according to embodiments of the invention. It will be understood that each block of the illustrations, and combinations of blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block or blocks.
[0036] These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act specified in the block or blocks.
[0037] The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the block or blocks.

Claims

CLAIMS The embodiments of the invention in which an exclusive property or right is claimed are defined as follows. Having thus described the invention, what is claimed is:
1. A system comprising:
a display device that displays video content to a person wherein the video content comprises a plurality of frames;
a media device that provides the video content to the display device;
a pointing device that the person uses to choose a coordinate on the display device; and
an element identifier provided by an annotation database in response to a query comprising the coordinate and frame specification data wherein the frame specification data identifies which of the frames was displayed on the display device when the person chose the coordinate.
2. The system of claim 1 wherein the video content is a movie, show, advertisement, or sporting event.
3. The system of claim 2 wherein the frame specification data comprises a timestamp and a media tag wherein the media tag identifies the video content, wherein the timestamp identifies a frame of the video content and wherein the frame was displayed to the user when the user chose the coordinate.
4. The system of claim 3 further comprising:
an additional data server that produces element data based on the element identifier; and
a data presentation that displays the element data to the person.
5. The system of claim 4 wherein the element identifier corresponds to an item for sale and the data presentation comprises an offer for the user to purchase the item.
6. The system of claim 4 wherein the element identifier corresponds to a person and the data presentation provides information about that person.
7. The system of claim 4 wherein the element identifier corresponds to a song and the data presentation comprises an offer for the user to purchase a copy of the song.
8. The system of claim 4 wherein the element identifier corresponds to a location and the data presentation comprises travel information for reaching the location.
9. The system of claim 1 further comprising:
an additional data server that produces element data based on the element identifier; and
a data presentation that displays the element data to the person.
10. A system comprising:
a display device that displays video content to a person wherein the video content comprises a plurality of frames;
a media device that provides the video content to the display device;
a pointing device that the person uses to choose a coordinate on the display device; and
an element identifier provided by an annotation service in response to a query comprising the coordinate and frame specification data wherein the frame specification data identifies which of the frames was displayed on the display device when the person chose the coordinate.
11. The system of claim 10 wherein the annotation service comprises an annotation database running on a remote server.
12. The system of claim 10 wherein the annotation service is a human worker who determines the element identifier by examining the frame and the coordinate.
13. The system of claim 12 further comprising an annotation database wherein the human worker causes the frame specification data, the coordinate, and the element identifier to be entered into the annotation database.
14. The system of claim 12 further comprising an additional data server that produces element data based on the element identifier and wherein the human worker causes the element data to be provided to the person.
15. The system of claim 12 further comprising a message that the human worker causes to be sent to the person.
16. A method comprising:
receiving a coordinate and frame specification data wherein the frame specification data identifies a frame of video content, wherein the coordinate specifies a location within the frame, and wherein a person selected the coordinate within the frame;
producing an element identifier corresponding to a scene element visible in the frame and located at the coordinate;
locating element data relating to the element identifier; and
presenting the element data to the person.
17. The method of claim 16 further comprising querying an annotation database to thereby obtain the element identifier.
18. The method of claim 17 wherein the element identifier specifies a purchasable item and wherein the element data comprises an offer for the user to buy the purchasable item.
19. The method of claim 17 wherein the scene element is a person.
20. The method of claim 16 wherein the element identifier specifies a purchasable item and wherein the element data comprises an offer for the user to buy the purchasable item.
PCT/US2010/061888 2009-12-31 2010-12-22 Flick intel annotation methods and systems WO2011082083A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US29183709P 2009-12-31 2009-12-31
US61/291,837 2009-12-31
US42926810P 2010-12-03 2010-12-03
US61/429,268 2011-01-03

Publications (2)

Publication Number Publication Date
WO2011082083A2 true WO2011082083A2 (en) 2011-07-07
WO2011082083A3 WO2011082083A3 (en) 2011-10-27

Family

ID=44227129

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/061888 WO2011082083A2 (en) 2009-12-31 2010-12-22 Flick intel annotation methods and systems

Country Status (1)

Country Link
WO (1) WO2011082083A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020083469A1 (en) * 2000-12-22 2002-06-27 Koninklijke Philips Electronics N.V. Embedding re-usable object-based product information in audiovisual programs for non-intrusive, viewer driven usage
US20050223034A1 (en) * 2004-03-31 2005-10-06 Kabushiki Kaisha Toshiba Metadata for object in video
US20090147990A1 (en) * 2004-12-06 2009-06-11 Dspv. Ltd. System and Method of Generic Symbol Recognition and User Authentication Using a Communication Device with Imaging Capabilities
KR20090107626A (en) * 2008-04-10 2009-10-14 주식회사 인프라웨어 Method for providing object information of moving picture using object area information

Also Published As

Publication number Publication date
WO2011082083A3 (en) 2011-10-27

Similar Documents

Publication Publication Date Title
US9508387B2 (en) Flick intel annotation methods and systems
US11011206B2 (en) User control for displaying tags associated with items in a video playback
US11743537B2 (en) User control for displaying tags associated with items in a video playback
US11741110B2 (en) Aiding discovery of program content by providing deeplinks into most interesting moments via social media
US10609308B2 (en) Overly non-video content on a mobile device
US9854277B2 (en) System and method for creation and management of advertising inventory using metadata
US9392211B2 (en) Providing video presentation commentary
US9288511B2 (en) Methods and apparatus for media navigation
US9143699B2 (en) Overlay non-video content on a mobile device
KR102574333B1 (en) Systems and methods for performing asr in the presence of heterograph
US9197911B2 (en) Method and apparatus for providing interaction packages to users based on metadata associated with content
US20090049092A1 (en) Content ancillary to sensory work playback
US20100070858A1 (en) Interactive Media System and Method Using Context-Based Avatar Configuration
US20080209480A1 (en) Method for enhanced video programming system for integrating internet data for on-demand interactive retrieval
CN102595212A (en) Simulated group interaction with multimedia content
CN102346898A (en) Automatic customized advertisement generation system
CN103384253B (en) The play system and its construction method of multimedia interaction function are presented in video
US20150312633A1 (en) Electronic system and method to render additional information with displayed media
JP2014176083A (en) Method for providing electronic program guide, multimedium reproduction system and computer readable storage medium
WO2011082083A2 (en) Flick intel annotation methods and systems
US11170817B2 (en) Tagging tracked objects in a video with metadata
Geary et al. A survey of co-located multi-device audio experiences
JP2023141808A (en) Moving image distribution device
TW202009784A (en) Method and electronic device for playing advertisements based on facial features
TW201739264A (en) Method and system for automatically embedding interactive elements into multimedia content

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10841601

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10841601

Country of ref document: EP

Kind code of ref document: A2