EP1320992A2 - Method for highlighting important information in a video program using visual cues - Google Patents

Method for highlighting important information in a video program using visual cues

Info

Publication number
EP1320992A2
Authority
EP
European Patent Office
Prior art keywords
cue
video clip
preselected
frames
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01971992A
Other languages
German (de)
French (fr)
Inventor
Mohamed Abdel-Mottaleb
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV filed Critical Koninklijke Philips Electronics NV
Publication of EP1320992A2 publication Critical patent/EP1320992A2/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7857Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using texture

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A method for highlighting important developments in a video clip of a sports event, such as a video clip of a soccer match, by inferring the developments from a cue provided from low level features in the video clip. The method detects sequences of frames in the video clip having a certain preselected visual or audible cue. The number of frames in each sequence having the cue is then compared to a predefined threshold number. If the number of frames in a sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence.

Description

Method for highlighting important information in a video program using visual cues
The present invention relates to content-based video retrieval and browsing, and more particularly, to a method for automatically identifying important information or developments in video clips of sports events.
Many video applications call for browsing methods which enable one to browse through a large amount of video material to find clips of particular importance. Such applications include, for example, interactive TV and pay-per-view systems. Customers who use interactive TV and pay-per-view systems want to see sections of programs before renting them. Video browsers enable these customers to find programs of interest.
Most work in content-based video retrieval and browsing is based on low-level features such as color, texture, shape and camera motion. Although low-level features can be useful for certain applications, many other interesting applications require higher level semantic information. Bridging the gap between low-level features and high-level semantic information is not always easy, and in most cases where higher level semantic information is required, manual annotation with keywords is used.
One important application of video archiving and retrieval is sports programming, such as soccer and football. Accordingly, a method is needed which enables automatic extraction of high level information using low level features.
The present invention is directed to a method for automatically identifying important developments in video clips of sporting events, especially soccer matches. The method comprises detecting sequences of frames in a video clip of a sporting event that have a preselected cue indicative of a possible important development in frames of the video clip immediately preceding the frame sequences having the preselected cue; comparing the number of frames in each of the frame sequences having the cue to a predefined threshold number; and declaring an important development in the frames immediately preceding each frame sequence if the number of frames in that sequence is equal to or greater than the threshold number.
The method further involves acquiring the preselected cue from low level features in the image in each frame of the sequence. In such an embodiment, the preselected cue is based on changes in the camera's center of attention. More particularly, when an important development occurs in the video clip, the camera typically focuses on the viewers or players, and thus, the images in the sequence of frames immediately subsequent to the frames with the important development have little or no grass areas.
The advantages, nature, and various additional features of the invention will appear more fully upon consideration of the illustrative embodiments now to be described in detail in connection with accompanying drawings wherein:
Fig. 1 is a flowchart outlining an algorithm that performs an illustrative embodiment of the method of the present invention;
Fig. 2 is a block diagram of a computer for implementing the present invention; and
Fig. 3 is a block diagram of the internal structure of the computer for implementing the present invention.
The method of the present invention extracts high level information from multiple images or video using low level features in order to achieve advancements in content-based retrieval and browsing. This is accomplished in the present invention by specifying a particular domain of interest and using knowledge specific to that domain to automatically extract high level information based on low level features. One especially useful application for the present invention is in highlighting segments of important developments in video clips of sports events, including but not limited to soccer matches and football games. Such video clips typically include video, audio, and textual (closed-captioning) information.
The method of the present invention highlights important developments in a video clip by inferring the developments from one or more cues which are provided from low level features and textual information of the video clip. More particularly, the method detects sequences of frames in the video clip having a certain preselected visual, audible, and/or textual (closed captioning) cue. The number of frames in each sequence having the cue(s) is then compared to a predefined threshold number. If the number of frames in a sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence with the cue. It has been found that important developments in video clips of sports events are typically marked with a visual cue which relates to changes in the camera's center of attention. For example, after an important development has taken place in a sports event such as a soccer match, the video camera usually focuses on the stadium viewers or the players. When the camera focuses on the viewers or players, little or none of the grass of the playing field can be seen in the camera's field of view.
Using changes in the camera's center of attention, the method of the present invention detects sequences of frames in the video clip with images that have little or no grass areas of the playing field. The number of frames in each sequence is compared to a predefined threshold number. If the number of frames in the sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence that has little or no grass areas. The threshold is based on the assumption that if the number of frames in the sequence with little or no grass areas of the playing field is significant, the camera must be focusing on the viewers or the players. Consequently, it is likely that the frames immediately preceding that sequence of frames include an important development, such as the scoring of a goal in the case of a soccer match.
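To make the thresholding step concrete, the following is a minimal Python sketch of the run-length logic described above. The function name, the per-frame cue predicate, and the 200-frame default threshold are illustrative assumptions, not details fixed by the invention.

```python
from typing import Callable, List, Sequence, Tuple

def find_highlights(frames: Sequence,
                    has_cue: Callable[[object], bool],
                    min_run: int = 200) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of cue runs at least min_run frames long.

    The frames immediately preceding each returned run are the ones in which
    an important development is declared. min_run = 200 is an assumed value.
    """
    highlights: List[Tuple[int, int]] = []
    run_start = None
    for i, frame in enumerate(frames):
        if has_cue(frame):
            if run_start is None:
                run_start = i                 # a new cue sequence begins here
        else:
            if run_start is not None and i - run_start >= min_run:
                highlights.append((run_start, i))
            run_start = None
    if run_start is not None and len(frames) - run_start >= min_run:
        highlights.append((run_start, len(frames)))   # run reaches the clip end
    return highlights
```

Each returned pair marks a cue sequence that met the threshold; a browsing application would then extract the frames just before each pair as a highlight segment.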
Fig. 1 shows a flowchart which outlines an illustrative embodiment of an algorithm for performing the method of the present invention as it applies to highlighting segments of important events in a video clip of a soccer match. The algorithm in step S1 detects sequences of frames in the video clip in which there are little or no grass areas. In step S2, if the number of frames in the sequence is larger than a predefined threshold, then in step S3, an important event is declared in the previous set of frames in the video clip.
In the detecting step S1, the algorithm detects green areas which have colors similar to grass. The algorithm is trained to differentiate the green colors from the other colors in each frame so that the grass areas in the frame can be identified. This is accomplished using patches from a training set of images of grass areas which have been extracted from the soccer match in the video clip, or from one or more previous soccer matches. The algorithm learns from the patches how the grass areas translate into values of the color green. Given an image in a frame of the video clip, the training is used to judge whether a given pixel in the frame is grass.
The algorithm is trained by calculating the normalized red and green colors (r, g), where r = R/(R+G+B) and g = G/(R+G+B), for each pixel in the training patches, and obtaining the normalized histogram for the class grass. A color histogram of an image is obtained by dividing a color space, such as red, green, and blue, into discrete image colors (called bins) and counting the number of times each discrete color appears by traversing every pixel in the image.
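A minimal sketch of this training computation follows, assuming the grass patches are supplied as H x W x 3 uint8 RGB arrays; the function name and the 32-bin quantization are illustrative assumptions.

```python
import numpy as np

def train_grass_histogram(patches, bins=32):
    """Accumulate a normalized 2-D histogram over normalized (r, g) colors.

    The result approximates the density p(pixel value | grass).
    The bin count is an assumed parameter, not specified by the patent.
    """
    hist = np.zeros((bins, bins), dtype=np.float64)
    for patch in patches:
        rgb = patch.reshape(-1, 3).astype(np.float64)
        total = rgb.sum(axis=1)
        total[total == 0] = 1.0            # guard against all-zero (black) pixels
        r = rgb[:, 0] / total              # r = R / (R + G + B)
        g = rgb[:, 1] / total              # g = G / (R + G + B)
        h, _, _ = np.histogram2d(r, g, bins=bins, range=[[0.0, 1.0], [0.0, 1.0]])
        hist += h
    return hist / hist.sum()               # normalize so the bins sum to 1
```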
This normalized histogram can be considered the probability density function for the class grass, p(pixel value | grass). The detection step S1 is accomplished in the algorithm by marking pixels in each frame that have a value of p(pixel value | grass) greater than a preselected threshold as pixels of grass. Based on this pixel classification, the algorithm in step S1 looks for connected components that have a color similar to the grass in the image of each frame, and if they are large enough, it is assumed that the camera is focusing on the playing field. If, however, the connected grass color components found in the image of the frame are small, then it is assumed that the camera is focusing on either the viewers or the players. If only small grass color components are detected for a short period of time in step S2, for example in only one to four frames, then no important event is declared in step S3. However, if small grass color components are detected for a relatively long period of time, for example in 200-300 frames, then an important event is declared in step S3.
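The pixel classification and connected-component test might be sketched as follows, using scipy.ndimage for the component labeling; the probability threshold and minimum-area fraction are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def frame_shows_field(frame, grass_hist, bins=32,
                      p_thresh=1e-4, min_area_frac=0.2):
    """True when the frame contains a large connected grass-colored region.

    p_thresh and min_area_frac are assumed values; the patent only requires
    that the components be "large enough".
    """
    rgb = frame.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0
    r_bin = np.clip((rgb[..., 0] / total * bins).astype(int), 0, bins - 1)
    g_bin = np.clip((rgb[..., 1] / total * bins).astype(int), 0, bins - 1)
    grass_mask = grass_hist[r_bin, g_bin] > p_thresh   # p(pixel value | grass) test
    labels, n = ndimage.label(grass_mask)              # connected components
    if n == 0:
        return False
    sizes = np.bincount(labels.ravel())[1:]            # component sizes, background excluded
    return sizes.max() >= min_area_frac * grass_mask.size
```

Frames for which this test returns False would then be the "little or no grass" frames counted against the threshold in step S2.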
The results obtained with the algorithm can be further refined using other cues, either from the same modality or from other modalities, such as audio or closed captions. Cues from the same or different modalities can be used to confirm the identity of the detected important occurrences or activities and, more importantly, to classify them into semantic classes, such as goals, attempted goals, penalties, injuries, fights between players and the like, and rank them by importance.
In one embodiment, the method of Fig. 1 is implemented by computer readable code executed by a data processing apparatus. The code may be stored in a memory within the data processing apparatus or read/downloaded from a memory medium such as a CD-ROM or floppy disk. In other embodiments, hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention. The invention, for example, can also be implemented on a computer 30 shown in Fig. 2.
The computer 30 may include a network connection 31 for interfacing to a data network, such as a variable-bandwidth network or the Internet, and a fax/modem connection 32 for interfacing with other remote sources such as a video or digital camera (not shown). The computer 30 may also include a display for displaying information (including video data) to a user, a keyboard for inputting text and user commands, a mouse for positioning a cursor on the display and for inputting user commands, a disk drive for reading from and writing to floppy disks installed therein, and a CD-ROM drive for accessing information stored on CD-ROM. The computer 30 may also have one or more peripheral devices 38 attached thereto for inputting images or the like, and a printer for outputting images, text, or the like.
Fig. 3 shows the internal structure of the computer 30, which includes a memory 40 that may include a Random Access Memory (RAM), a Read-Only Memory (ROM) and a computer-readable medium such as a hard disk. The items stored in the memory 40 include an operating system 41, data 42 and applications 43. The operating system 41 may be a windowing operating system, such as UNIX, although the invention may be used with other operating systems as well, such as Microsoft Windows 95.
In addition to the method of Fig. 1, the applications stored in the memory 40 include a video coder 44, a video decoder 45 and a frame grabber 46. The video coder 44 encodes video data in a conventional manner, and the video decoder 45 decodes video data which has been coded in the conventional manner. The frame grabber 46 allows single frames from a video signal stream to be captured and processed.
Also included in the computer 30 are a central processing unit (CPU) 50, a communication interface 51, a memory interface 52, a CD-ROM drive interface 53, a video interface 54 and a bus 55. The CPU 50 comprises a microprocessor or the like for executing computer readable code, i.e., applications, such as those noted above, out of the memory 40. Such applications may be stored in the memory 40 (as noted above) or, alternatively, on a floppy disk in disk drive 36 or a CD-ROM in CD-ROM drive 37. The CPU 50 accesses the applications (or other data) stored on a floppy disk via the memory interface 52 and accesses the applications (or other data) stored on a CD-ROM via the CD-ROM drive interface 53. Input video data may be received through the video interface 54 or the communication interface 51. The input video data may be decoded by the video decoder 45. Output video data may be coded by the video coder 44 for transmission through the video interface 54 or the communication interface 51.
While the foregoing invention has been described with reference to the above embodiments, various modifications and changes can be made without departing from the spirit of the invention. Accordingly, all such modifications and changes are considered to be within the scope of the appended claims.

Claims

CLAIMS:
1. A method for automatically identifying important occurrences or activities in video clips of sporting events, the method comprising the steps of: a) providing a video clip of a sporting event generated by a camera; b) detecting sequences of frames in the video clip that have a preselected cue indicative of a possible important development in frames of the video clip immediately preceding the frame sequences having the preselected cue; c) comparing the number of frames in each of the frame sequences having the cue to a predefined threshold number; and d) declaring an important development in the frames immediately preceding each frame sequence if the number of frames in that sequence is equal to or greater than the threshold number.
2. The method according to claim 1, wherein the preselected cue is visual.
3. The method according to claim 1, wherein the preselected cue is based on changes in the center of attention of the camera.
4. The method according to claim 1, wherein each frame in the sequence has an image and the preselected cue is acquired from the images.
5. The method according to claim 4, wherein the preselected cue includes the images having little or no grass areas.
6. The method according to claim 1, wherein the sporting event shown in the video clip is a soccer match.
7. The method according to claim 1, wherein the preselected cue is provided from low level features of the video clip.
8. The method according to claim 1, wherein the preselected cue is provided from low level visual features of the video clip.
9. The method according to claim 8, wherein the low level visual features include color.
10. The method according to claim 1, wherein the preselected cue is provided from low level audio features of the video clip.
11. The method according to claim 1, wherein the preselected cue is provided from textual information of the video clip.
12. The method according to claim 11, further comprising the step of confirming the identity of the important occurrences or activities using the textual information of the video clip.
13. The method according to claim 11, further comprising the step of classifying the important occurrences or activities in semantic classes using the textual information of the video clip.
14. The method according to claim 1, wherein the preselected cue includes a plurality of preselected cues.
15. The method according to claim 1, wherein the preselected cues include low level visual and audio features of the video clip and textual information of the video clip.
16. An apparatus for automatically identifying important occurrences or activities in video clips of sporting events, comprising:
- a memory for storing executable code; and
- a processor for executing the code stored in the memory so as to (a) provide a video clip of a sporting event generated by a camera, (b) detect sequences of frames in the video clip that have a preselected cue indicative of a possible important development in frames of the video clip immediately preceding the frame sequences having the preselected cue, (c) compare the number of frames in each of the frame sequences having the cue to a predefined threshold number and (d) declare an important development in the frames immediately preceding each frame sequence if the number of frames in that sequence is equal to or greater than the threshold number.
EP01971992A 2000-09-13 2001-08-30 Method for highlighting important information in a video program using visual cues Withdrawn EP1320992A2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US66091800A 2000-09-13 2000-09-13
US660918 2000-09-13
PCT/EP2001/010112 WO2002023891A2 (en) 2000-09-13 2001-08-30 Method for highlighting important information in a video program using visual cues

Publications (1)

Publication Number Publication Date
EP1320992A2 true EP1320992A2 (en) 2003-06-25

Family

ID=24651479

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01971992A Withdrawn EP1320992A2 (en) 2000-09-13 2001-08-30 Method for highlighting important information in a video program using visual cues

Country Status (3)

Country Link
EP (1) EP1320992A2 (en)
JP (1) JP2004509529A (en)
WO (1) WO2002023891A2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4577774B2 (en) * 2005-03-08 2010-11-10 Kddi株式会社 Sports video classification device and log generation device
US20100005485A1 (en) * 2005-12-19 2010-01-07 Agency For Science, Technology And Research Annotation of video footage and personalised video generation
US9047374B2 (en) 2007-06-08 2015-06-02 Apple Inc. Assembling video content
JP2011015129A (en) * 2009-07-01 2011-01-20 Mitsubishi Electric Corp Image quality adjusting device
JP6427902B2 (en) 2014-03-17 2018-11-28 富士通株式会社 Extraction program, method, and apparatus
JP6354229B2 (en) 2014-03-17 2018-07-11 富士通株式会社 Extraction program, method, and apparatus

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU719329B2 (en) * 1997-10-03 2000-05-04 Canon Kabushiki Kaisha Multi-media editing method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3728775B2 (en) * 1995-08-18 2005-12-21 株式会社日立製作所 Method and apparatus for detecting feature scene of moving image
KR100206804B1 (en) * 1996-08-29 1999-07-01 구자홍 The automatic selection recording method of highlight part
JPH1155613A (en) * 1997-07-30 1999-02-26 Hitachi Ltd Recording and/or reproducing device and recording medium using same device
WO1999045483A1 (en) * 1998-03-04 1999-09-10 The Trustees Of Columbia University In The City Of New York Method and system for generating semantic visual templates for image and video retrieval
US6163510A (en) * 1998-06-30 2000-12-19 International Business Machines Corporation Multimedia search and indexing system and method of operation using audio cues with signal thresholds

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU719329B2 (en) * 1997-10-03 2000-05-04 Canon Kabushiki Kaisha Multi-media editing method and apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHANG Y.-L. ET AL: "Integrated image and speech analysis for content-based video indexing", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, 17 June 1996 (1996-06-17), LOS ALAMITOS, CA, US, pages 306 - 313 *
See also references of WO0223891A3 *

Also Published As

Publication number Publication date
WO2002023891A3 (en) 2002-05-30
JP2004509529A (en) 2004-03-25
WO2002023891A2 (en) 2002-03-21

Similar Documents

Publication Publication Date Title
US7339992B2 (en) System and method for extracting text captions from video and generating video summaries
Truong et al. Scene extraction in motion pictures
JP4643829B2 (en) System and method for analyzing video content using detected text in a video frame
US7120873B2 (en) Summarization of sumo video content
JP5420199B2 (en) Video analysis device, video analysis method, digest automatic creation system and highlight automatic extraction system
CN110381366B (en) Automatic event reporting method, system, server and storage medium
US8340498B1 (en) Extraction of text elements from video content
EP2089820B1 (en) Method and apparatus for generating a summary of a video data stream
WO2019007020A1 (en) Method and device for generating video summary
US8051446B1 (en) Method of creating a semantic video summary using information from secondary sources
Snoek et al. Time interval maximum entropy based event indexing in soccer video
EP1320992A2 (en) Method for highlighting important information in a video program using visual cues
Chen et al. Knowledge-based approach to video content classification
Choroś Highlights extraction in sports videos based on automatic posture and gesture recognition
US20070124678A1 (en) Method and apparatus for identifying the high level structure of a program
Brezeale Learning video preferences using visual features and closed captions
Jung et al. Player information extraction for semantic annotation in golf videos
Bailer et al. Skimming rushes video using retake detection
CN117221669B (en) Bullet screen generation method and device
US11417100B2 (en) Device and method of generating video synopsis of sports game
Lotfi A Novel Hybrid System Based on Fractal Coding for Soccer Retrieval from Video Database
Hsieh et al. Constructing a bowling information system with video content analysis
Gupta A Survey on Video Content Analysis
Brezeale et al. Learning video preferences from video content
Nitta Semantic content analysis of broadcasted sports videos with intermodal collaboration

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030414

AK Designated contracting states

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17Q First examination report despatched

Effective date: 20090312

APBK Appeal reference recorded

Free format text: ORIGINAL CODE: EPIDOSNREFNE

APBN Date of receipt of notice of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA2E

APBR Date of receipt of statement of grounds of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA3E

APAF Appeal reference modified

Free format text: ORIGINAL CODE: EPIDOSCREFNE

APBT Appeal procedure closed

Free format text: ORIGINAL CODE: EPIDOSNNOA9E

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: KONINKLIJKE PHILIPS N.V.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20130301