EP1320992A2 - Method for highlighting important information in a video program using visual cues - Google Patents
- Publication number
- EP1320992A2 (application EP01971992A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- cue
- video clip
- preselected
- frames
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/785—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/7857—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using texture
Definitions
- the present invention relates to content-based video retrieval and browsing, and more particularly, to a method for automatically identifying important information or developments in video clips of sports events.
- Video applications call for browsing methods which enable one to browse through a large amount of video material to find clips which are of a certain importance.
- Such applications may include for example, interactive TV and pay-per-view systems.
- Customers who use interactive TV and pay-per-view systems want to see sections of programs before renting them.
- Video browsers enable the customers to find programs of interest.
- such browsing methods have typically relied on low-level features such as color, texture, shape and camera motion.
- while low-level features can be useful for certain applications, many other interesting applications require higher-level semantic information. Bridging the gap between low-level features and high-level semantic information is not always easy; in most cases where higher-level semantic information is required, manual annotation using keywords is used instead.
- One of the important applications for video archiving and retrieval is for sports such as soccer, football, etc. Accordingly, a method is needed which enables automatic extraction of high level information using low level features.
- the present invention is directed to a method for automatically identifying important developments in video clips of sporting events, especially soccer matches.
- the method comprises detecting sequences of frames in a video clip of a sporting event that have a preselected cue indicative of a possible important development in frames of the video clip immediately preceding the frame sequences having the preselected cue; comparing the number of frames in each of the frame sequences having the cue to a predefined threshold number; and declaring an important development in the frames immediately preceding each frame sequence if the number of frames in that sequence is equal to or greater than the threshold number.
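The detect-compare-declare steps above can be sketched as follows; the function name, the boolean per-frame cue representation, and the example values are illustrative assumptions, not taken from the patent:

```python
def find_important_developments(cue_per_frame, threshold):
    """Return (start, end) index pairs of cue-frame runs whose length
    meets the threshold; per the method, the important development is
    assumed to lie in the frames immediately preceding each run."""
    runs = []
    start = None
    for i, has_cue in enumerate(cue_per_frame):
        if has_cue and start is None:
            start = i                          # a run of cue frames begins
        elif not has_cue and start is not None:
            if i - start >= threshold:         # run is long enough to count
                runs.append((start, i))
            start = None
    # handle a run that extends to the end of the clip
    if start is not None and len(cue_per_frame) - start >= threshold:
        runs.append((start, len(cue_per_frame)))
    return runs

# Frames 5-9 carry the cue (e.g. "little or no grass visible").
cues = [False] * 5 + [True] * 5 + [False] * 3
print(find_important_developments(cues, threshold=4))  # [(5, 10)]
```

A caller would then flag the frames just before index 5 as a candidate highlight segment.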
- the method further involves acquiring the preselected cue from low level features in the image in each frame of the sequence.
- the preselected cue is based on changes in the camera's center of attention. More particularly, when an important development occurs in the video clip, the camera typically focuses on the viewers or players, and thus, the images in the sequence of frames immediately subsequent to the frames with the important development have little or no grass areas.
- Fig. 1 is a flowchart outlining an algorithm that performs an illustrative embodiment of the method of the present invention
- Fig. 2 is a block diagram of a computer for implementing the present invention.
- Fig. 3 is a block diagram of the internal structure of the computer for implementing the present invention.
- the method of the present invention extracts high level information from multiple images or video using low level features in order to achieve advancements in content-based retrieval and browsing. This is accomplished in the present invention by specifying a particular domain of interest and using knowledge specific to that domain to automatically extract high level information based on low level features.
- One especially useful application for the present invention is in highlighting segments of important developments in video clips of sports events, including but not limited to soccer matches and football games. Such video clips typically include video, audio, and textual (closed-captioning) information.
- the method of the present invention highlights important developments in a video clip by inferring them from one or more cues provided by low-level features and textual information of the video clip. More particularly, the method detects sequences of frames in the video clip having a certain preselected visual, audible, and/or textual (closed-captioning) cue. The number of frames in each sequence having the cue(s) is then compared to a predefined threshold number. If the number of frames in a sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence with the cue. It has been found that important developments in video clips of sports events are typically marked by a visual cue relating to changes in the camera's center of attention.
- the video camera usually focuses on the stadium viewers or the players.
- when the camera focuses on the viewers or players, little or none of the grass of the playing field can be seen in the camera's field of view.
- the method of the present invention detects sequences of frames in the video clip with images that have little or no grass areas of the playing field.
- the number of frames in each sequence is compared to a predefined threshold number. If the number of frames in the sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence that has little or no grass areas.
- the threshold is based on the assumption that if the number of frames in the sequence with little or no grass areas of the playing field is significant, the camera must be focusing on the viewers or the players. Consequently, it is likely that the frames immediately preceding that sequence of frames includes an important development such as the scoring of a goal in the case of a soccer match.
- Fig. 1 shows a flowchart which outlines an illustrative embodiment of an algorithm for performing the method of the present invention as it applies to highlighting segments of important events in a video clip of a soccer match.
- the algorithm in step S1 detects sequences of frames in the video clip in which there are little or no grass areas.
- in step S2, if the number of frames in the sequence is larger than a predefined threshold, then in step S3 an important development is declared.
- the algorithm detects green areas which have colors similar to grass.
- the algorithm is trained to differentiate the green colors from the other colors in each frame so that the grass areas in the frame can be identified. This is accomplished using patches from a training set of images of grass areas which have been extracted from the soccer match in the video clip, or from one or more previous soccer matches.
- the algorithm learns from the patches how the grass areas translate into the values of the color green. Given an image in a frame of the video clip, the training is used to judge whether a given pixel in the frame is grass.
- a color histogram of an image is obtained by dividing a color space, such as red, green, and blue, into discrete image colors (called bins) and counting the number of times each discrete color appears by traversing every pixel in the image.
- This normalized histogram can be considered as the probability density function for the class grass, p(pixel value | grass).
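A minimal sketch of such a normalized histogram, assuming a coarse quantization of 8 bins per RGB channel (the bin count and all function names here are illustrative, not specified in the patent):

```python
def build_grass_histogram(patch_pixels, bins_per_channel=8):
    """Quantize each (r, g, b) training pixel from the grass patches into
    a coarse bin and count occurrences; dividing by the total pixel count
    turns the histogram into an estimate of p(pixel value | grass)."""
    step = 256 // bins_per_channel
    counts = {}
    for r, g, b in patch_pixels:
        key = (r // step, g // step, b // step)
        counts[key] = counts.get(key, 0) + 1
    total = len(patch_pixels)
    return {key: n / total for key, n in counts.items()}

def p_grass(pixel, hist, bins_per_channel=8):
    """Look up p(pixel value | grass) for an (r, g, b) pixel."""
    step = 256 // bins_per_channel
    r, g, b = pixel
    return hist.get((r // step, g // step, b // step), 0.0)

# Toy training set: mostly grass-green pixels, one reddish outlier.
patches = [(40, 160, 50)] * 9 + [(200, 40, 40)]
hist = build_grass_histogram(patches)
print(p_grass((45, 165, 60), hist))  # 0.9 (falls in the dominant green bin)
```

In practice the training patches would come from frames of the match itself or of earlier matches, as the description notes.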
- the detection step S1 is accomplished in the algorithm by marking pixels in each frame that have a value of p(pixel value | grass) above a preselected threshold as grass pixels; frames in which few such pixels are marked are frames with little or no grass area.
- if only small grass color components are detected for a short period of time in step S2, for example in only three or four frames, then no important event is declared in step S3. However, if small grass color components are detected for a relatively long period of time, for example in 200-300 frames, then an important event is declared in step S3.
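The per-frame half of this decision can be sketched as follows; the probability threshold, the grass-fraction cutoff, and all names are illustrative assumptions rather than values from the patent:

```python
def frame_has_little_grass(frame_pixels, grass_hist, bins_per_channel=8,
                           p_threshold=0.05, grass_fraction_cutoff=0.2):
    """grass_hist maps quantized (r, g, b) bins to p(pixel value | grass).
    A pixel counts as grass when its bin probability exceeds p_threshold;
    the frame carries the "little or no grass" cue when fewer than
    grass_fraction_cutoff of its pixels are grass."""
    step = 256 // bins_per_channel
    grass = 0
    for r, g, b in frame_pixels:
        if grass_hist.get((r // step, g // step, b // step), 0.0) > p_threshold:
            grass += 1
    return grass / len(frame_pixels) < grass_fraction_cutoff

# Toy histogram: a single green bin holds nearly all grass probability.
hist = {(1, 5, 1): 0.95}
wide_shot = [(40, 170, 50)] * 80 + [(120, 120, 120)] * 20   # 80% grass pixels
crowd_shot = [(120, 120, 120)] * 95 + [(40, 170, 50)] * 5   # 5% grass pixels
print(frame_has_little_grass(wide_shot, hist))   # False
print(frame_has_little_grass(crowd_shot, hist))  # True
```

Frames labeled True would then be accumulated into runs, with short runs (a few frames) ignored and long runs (hundreds of frames) triggering the declaration of an important event.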
- the results obtained with the algorithm can be further refined using other cues either from the same modality or from other modalities, such as audio or closed captions. Cues from the same modalities or different modalities can be used to confirm the identity of the detected important occurrences or activities and more importantly, to classify the detected important occurrences or activities into semantic classes, such as goals, attempted goals, penalties, injuries, fights between players and the like, and rank them by importance.
- the method of Figure 1 is implemented by computer-readable code executed by a data processing apparatus.
- the code may be stored in a memory within the data processing apparatus or read/downloaded from a memory medium such as a CD-ROM or floppy disk.
- hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention.
- the invention for example, can also be implemented on a computer 30 shown in Fig. 2.
- the computer 30 may include a network connection 31 for interfacing to a data network, such as a variable-bandwidth network or the Internet, and a fax/modem connection 32 for interfacing with other remote sources such as a video or a digital camera (not shown).
- the computer 30 may also include a display for displaying information (including video data) to a user, a keyboard for inputting text and user commands, a mouse for positioning a cursor on the display and for inputting user commands, a disk drive for reading from and writing to floppy disks installed therein, and a CD-ROM drive for accessing information stored on CD-ROM.
- the computer 30 may also have one or more peripheral devices 38 attached thereto for inputting images or the like, and a printer for outputting images, text, or the like.
- Fig. 3 shows the internal structure of the computer 30 which includes a memory 40 that may include a Random Access Memory (RAM), Read-Only Memory (ROM) and a computer-readable medium such as a hard disk.
- the items stored in the memory 40 include an operating system 41, data 42 and applications 43.
- the operating system 41 may be a windowing operating system, such as UNIX, although the invention may be used with other operating systems as well, such as Microsoft Windows 95.
- the applications stored in the memory 40 include a video coder 44, a video decoder 45 and a frame grabber 46.
- the video coder 44 encodes video data in a conventional manner
- the video decoder 45 decodes video data which has been coded in the conventional manner.
- the frame grabber 46 allows single frames from a video signal stream to be captured and processed.
- the CPU 50 comprises a microprocessor or the like for executing computer-readable code, i.e., applications such as those noted above, out of the memory 40.
- applications may be stored in memory 40 (as noted above) or, alternatively, on a floppy disk in disk drive 36 or a CD-ROM in CD-ROM drive 37.
- the CPU 50 accesses the applications (or other data) stored on a floppy disk via the memory interface 52 and accesses the applications (or other data) stored on a CD-ROM via CD-ROM drive interface 53.
- Input video data may be received through the video interface 54 or the communication interface 51.
- the input video data may be decoded by the video decoder 45.
- Output video data may be coded by the video coder 44 for transmission through the video interface 54 or the communication interface 51.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Library & Information Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Television Signal Processing For Recording (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US66091800A | 2000-09-13 | 2000-09-13 | |
US660918 | 2000-09-13 | ||
PCT/EP2001/010112 WO2002023891A2 (en) | 2000-09-13 | 2001-08-30 | Method for highlighting important information in a video program using visual cues |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1320992A2 true EP1320992A2 (en) | 2003-06-25 |
Family
ID=24651479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP01971992A Withdrawn EP1320992A2 (en) | 2000-09-13 | 2001-08-30 | Method for highlighting important information in a video program using visual cues |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1320992A2 (en) |
JP (1) | JP2004509529A (en) |
WO (1) | WO2002023891A2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4577774B2 (en) * | 2005-03-08 | 2010-11-10 | Kddi株式会社 | Sports video classification device and log generation device |
WO2007073347A1 (en) * | 2005-12-19 | 2007-06-28 | Agency For Science, Technology And Research | Annotation of video footage and personalised video generation |
US9047374B2 (en) | 2007-06-08 | 2015-06-02 | Apple Inc. | Assembling video content |
JP2011015129A (en) * | 2009-07-01 | 2011-01-20 | Mitsubishi Electric Corp | Image quality adjusting device |
JP6354229B2 (en) | 2014-03-17 | 2018-07-11 | 富士通株式会社 | Extraction program, method, and apparatus |
JP6427902B2 (en) | 2014-03-17 | 2018-11-28 | 富士通株式会社 | Extraction program, method, and apparatus |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU719329B2 (en) * | 1997-10-03 | 2000-05-04 | Canon Kabushiki Kaisha | Multi-media editing method and apparatus |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3728775B2 (en) * | 1995-08-18 | 2005-12-21 | 株式会社日立製作所 | Method and apparatus for detecting feature scene of moving image |
KR100206804B1 (en) * | 1996-08-29 | 1999-07-01 | 구자홍 | The automatic selection recording method of highlight part |
JPH1155613A (en) * | 1997-07-30 | 1999-02-26 | Hitachi Ltd | Recording and/or reproducing device and recording medium using same device |
WO1999045483A1 (en) * | 1998-03-04 | 1999-09-10 | The Trustees Of Columbia University In The City Of New York | Method and system for generating semantic visual templates for image and video retrieval |
US6163510A (en) * | 1998-06-30 | 2000-12-19 | International Business Machines Corporation | Multimedia search and indexing system and method of operation using audio cues with signal thresholds |
-
2001
- 2001-08-30 JP JP2002527199A patent/JP2004509529A/en active Pending
- 2001-08-30 EP EP01971992A patent/EP1320992A2/en not_active Withdrawn
- 2001-08-30 WO PCT/EP2001/010112 patent/WO2002023891A2/en active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU719329B2 (en) * | 1997-10-03 | 2000-05-04 | Canon Kabushiki Kaisha | Multi-media editing method and apparatus |
Non-Patent Citations (2)
Title |
---|
CHANG Y.-L. ET AL: "Integrated image and speech analysis for content-based video indexing", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIACOMPUTING AND SYSTEMS, 17 June 1996 (1996-06-17), LOS ALAMITOS, CA, US, pages 306 - 313 * |
See also references of WO0223891A3 * |
Also Published As
Publication number | Publication date |
---|---|
JP2004509529A (en) | 2004-03-25 |
WO2002023891A3 (en) | 2002-05-30 |
WO2002023891A2 (en) | 2002-03-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7339992B2 (en) | System and method for extracting text captions from video and generating video summaries | |
Truong et al. | Scene extraction in motion pictures | |
US8028234B2 (en) | Summarization of sumo video content | |
JP5420199B2 (en) | Video analysis device, video analysis method, digest automatic creation system and highlight automatic extraction system | |
CN110381366B (en) | Automatic event reporting method, system, server and storage medium | |
US8340498B1 (en) | Extraction of text elements from video content | |
EP2089820B1 (en) | Method and apparatus for generating a summary of a video data stream | |
JP6557592B2 (en) | Video scene division apparatus and video scene division program | |
WO2019007020A1 (en) | Method and device for generating video summary | |
US8051446B1 (en) | Method of creating a semantic video summary using information from secondary sources | |
EP1320992A2 (en) | Method for highlighting important information in a video program using visual cues | |
Chen et al. | Knowledge-based approach to video content classification | |
Choroś | Highlights extraction in sports videos based on automatic posture and gesture recognition | |
US20070124678A1 (en) | Method and apparatus for identifying the high level structure of a program | |
Jung et al. | Player information extraction for semantic annotation in golf videos | |
Brezeale | Learning video preferences using visual features and closed captions | |
Lotfi | A Novel Hybrid System Based on Fractal Coding for Soccer Retrieval from Video Database | |
Bailer et al. | Skimming rushes video using retake detection | |
US11417100B2 (en) | Device and method of generating video synopsis of sports game | |
Hsieh et al. | Constructing a bowling information system with video content analysis | |
Gupta | A Survey on Video Content Analysis | |
Gao et al. | A study of intelligent video indexing system | |
Brezeale et al. | Learning video preferences from video content | |
Nitta | Semantic content analysis of broadcasted sports videos with intermodal collaboration | |
CN117221669A (en) | Bullet screen generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20030414 |
|
AK | Designated contracting states |
Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
|
AX | Request for extension of the european patent |
Extension state: AL LT LV MK RO SI |
|
17Q | First examination report despatched |
Effective date: 20090312 |
|
APBK | Appeal reference recorded |
Free format text: ORIGINAL CODE: EPIDOSNREFNE |
|
APBN | Date of receipt of notice of appeal recorded |
Free format text: ORIGINAL CODE: EPIDOSNNOA2E |
|
APBR | Date of receipt of statement of grounds of appeal recorded |
Free format text: ORIGINAL CODE: EPIDOSNNOA3E |
|
APAF | Appeal reference modified |
Free format text: ORIGINAL CODE: EPIDOSCREFNE |
|
APBT | Appeal procedure closed |
Free format text: ORIGINAL CODE: EPIDOSNNOA9E |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: KONINKLIJKE PHILIPS N.V. |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20130301 |