WO2001015015A1 - Video data structure for video browsing based on content - Google Patents

Video data structure for video browsing based on content

Info

Publication number
WO2001015015A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
event
relation
segment
information
Prior art date
Application number
PCT/KR2000/000967
Other languages
French (fr)
Inventor
Jung Min Song
Jin Soo Lee
Original Assignee
Lg Electronics Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lg Electronics Inc. filed Critical Lg Electronics Inc.
Priority to AU67383/00A priority Critical patent/AU6738300A/en
Publication of WO2001015015A1 publication Critical patent/WO2001015015A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23412 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/74 Browsing; Visualisation therefor
    • G06F 16/745 Browsing; Visualisation therefor the internal structure of a single video sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F 16/81 Indexing, e.g. XML tags; Data structures therefor; Storage structures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N 19/25 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with scene description coding, e.g. binary format for scenes [BIFS] compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 Monomedia components thereof
    • H04N 21/8126 Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
    • H04N 21/8133 Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N 21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/41 Structure of client; Structure of client peripherals
    • H04N 21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N 21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N 21/4316 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/482 End-user interface for program selection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • H04N 5/91 Television signal processing therefor
    • H04N 5/92 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
    • H04N 5/926 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback by pulse code modulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A video data structure for video browsing based on content is disclosed. The present video data structure supports video browsing which allows users to clearly understand the relations between characters, and the changes in those relations, as the plot of a movie or drama develops.

Description

VIDEO DATA STRUCTURE FOR VIDEO BROWSING BASED ON CONTENT
BACKGROUND OF THE INVENTION

Field of the Invention
The present invention relates to a video browsing system, and more particularly to a video data structure for video browsing based on content, in which the contents of a video may be summarized and browsed by defining relations among characters, places and events.

Background of the Related Art

Typically, users simply view movies and/or dramas as broadcast through a TV or played at a movie theater. However, a user may wish to view a particular movie or drama at a particular time, or wish to view only a particular section of a movie or drama. Accordingly, various techniques which enable selective watching of a movie/drama or sections of a movie/drama have been suggested.
In the related art, for example, various video data may be represented or classified into a format chunk, index chunk, media chunk, segment chunk, target chunk, and/or representation chunk. Also, data on various characters or objects, such as the name of an object, its position on the screen, and numeric data relating it to the segments of the video data in which the object appears, may be represented by the target and representation chunks. Accordingly, a user can select an object through a table and reproduce for display a particular segment where the object is shown in the video.
In another related art, various additional data of a video are obtained before, during or after the production of the video data. Thereafter, an additional information table of the obtained data is composed and provided to users. Namely, the additional data table may include a position where an actor appears, a position where a character of the actor appears, and a position where stage properties appear, such that a scene can be reproduced as selected by a user through the additional data table. For example, if a user selects a stage property, information on the selected stage property such as the manufacturer and price may be displayed on a screen, and the user may be able to connect with the manufacturer or a seller of the stage property through a network connection.

In still another related art, recording information on each segment of a video in a video map has been suggested. That is, information such as the degree of violence, the degree of adult contents, the degree of importance of contents, character positions, and the degree of difficulty in understanding may be indicated for each segment of a video in the video map. Thus, the user may set a degree of preference for one or more items of the video map, and only segments of the video meeting the set degree of preference would be reproduced, thereby limiting a display of particular contents to unauthorized viewers.

Other techniques in the related art which allow users to selectively view a portion of a video include a temporal relational graph of shots for a video. However, viewing a temporal relational graph is similar to viewing several representative scenes of a video and would not allow a user to easily follow the contents of a video. Similarly, other techniques in the related art as described above provide items simply arranged without any relation to the objects appearing in the movie or drama, based upon the selection of the user. However, the contents of a movie or drama generally build around relations between characters, places and events. For example, relations between characters may not change from the beginning to the end of the story or may continuously vary. Moreover, since one or more characters relate to a specific character in the movie or drama, the browsing methods in the related art substantially fail to provide an accurate understanding of the story of the movie or drama to the user.
Therefore, techniques in the related art have the disadvantage that a user cannot understand a video in terms of the relations among characters, the changes in those relations as events develop, and the relations among characters and places.
SUMMARY OF THE INVENTION
Accordingly, an object of the present invention is to solve at least the problems and disadvantages of the related art.
Another object of the present invention is to provide a video data structure for video browsing based on content which allows users to easily browse and understand contents of a video.

Another object of the present invention is to provide a video data structure for video browsing based on content which allows users to fully understand relations between characters of a video by representing relations which vary according to the development of events in the video.
A further object of the present invention is to provide a video data structure for video browsing based on content which allows users to easily recognize the whole content of a video by summarizing the content.
A still further object of the present invention is to provide a video data structure for video browsing based on content which allows users to watch a portion of a video based upon a particular event.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and advantages of the invention may be realized and attained as particularly pointed out in the appended claims.
To achieve the objects and in accordance with the purposes of the invention, as embodied and broadly described herein, a video data structure for video browsing based on content comprises objects and events; characters and events as objects; places used as backgrounds of objects; characters, places and events as objects; and a plot of a video content. Here, the content of a video is expressed based on relations between factors, including relations which change during the development of a movie or drama. Thus, a user can browse and view a segment of an event corresponding to a selected relation.
In another embodiment of the present invention, a video data structure for video browsing based on content centering on relations between factors includes data representing an actual segment of a video; data representing the video content; and connecting data which indicates data representing an actual segment of the video by connecting factors to the data representing the video content.
In still another embodiment of the present invention, a video data structure for video browsing based on content includes semantic data including relations between characters and changes in the relations; and data which connects relations between characters with video segments corresponding to the changes in the relations. Thus, a user can browse a video by selecting relations and changes in the relations between characters.
In a further embodiment of the present invention, a video data structure for video browsing based on content includes semantic data including objects and relations between the objects and places where the objects appear, and data for connecting corresponding video segments to the relations between the objects and the places. Thus, a user can browse a video and view a segment of an event corresponding to a relation between objects and places.
In still a further embodiment of the present invention, in a video data structure for video browsing based on content, the whole content of a video is expressed by a set of paragraphs corresponding to events as factors constituting the video. Thus, a user can browse and view the content of an event corresponding to an event paragraph.
BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in detail with reference to the following drawings in which like reference numerals refer to like elements wherein:
Fig. 1 is a view of a video browser supported by a data structure according to the present invention;
Fig. 2 shows a video data structure in which video data is represented centering on relations among characters according to the present invention;
Fig. 3 is a view of a video story browser supported by a data structure according to the present invention;
Fig. 4 is a video data structure for representing a video in an object-place relation graph according to a first embodiment of the present invention;
Fig. 5 is a video data structure for representing a video in an object-place relation graph according to another embodiment of the present invention;
Fig. 6 is a video data structure for video browsing based on segments of an event after representing a content of a video by a set of event paragraphs according to the present invention;
Fig. 7 is a video data structure which includes video summary data according to the present invention; and
Fig. 8 is a video data structure for video browsing based on content centering on relations between characters according to the present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Generally, the present video data structure of relations and changes in relations between objects in a video allows a user to easily understand and browse a video. Fig. 1 shows an example screen of a video browser which is based on relations between objects and is supported by a video data structure in accordance with the present invention. Particularly, a video browser based on contents is disclosed in copending U.S. Patent Application Serial Nos. ? and 09/239,531, which are fully incorporated herein.
Referring to Fig. 1, a screen of character relations, a screen of main scenes, and a main screen are displayed. Although an object may be any person, item or place that appears in a video, for purposes of explanation, objects will be assumed to be characters and places in a video.
The screen of character relations includes a character screen 101 which represents the characters of a video, and a relation screen 102 which represents relations and changes in the relations between a character selected in the character screen 101 and other characters of the video. Here, constant relations between the selected character and other characters are shown in an upper level of a relation tree structure, while changes in relations are shown in a lower level of the relation tree structure. Also, a main scene screen 103 displays key frames of events which show constant relations or changes in relations between characters as selected from the relation screen 102, and a main screen 104 displays a video segment corresponding to a scene of an event selected from the main scene screen 103.
In the present invention, a constant relation means either a relation between characters that cannot change throughout a video, such as a parent to child relation, or a relation which is most representative of the relations between characters. Also, the displayed constant relation may include additional information such as the number of variable relations in the lower tree structure. For example, in the displayed constant relation between 'character 1' and 'character 2,' the number '2' displayed above 'character 2' indicates that there are two variable relations in the tree structure.
In Fig. 1, for example, a user selected 'character 1' from the character screen 101, and relations with 'character 2' and 'character 3' are shown in the relation screen 102 by a tree structure. Also, since the user selected relation 2 with 'character 2' from the relation screen 102, scenes of significant events corresponding to relation 2 between 'character 1' and 'character 2' are shown by key frames in the main scene screen 103. In the present invention, a significant event may mean an event which shows a corresponding relation or an event which brought about a change in a corresponding relation. Finally, 'scene 10' is selected from the main scene screen 103 and the video segment corresponding to 'scene 10' is displayed in the main screen 104.
The video browser as shown in Fig. 1 may be implemented by a video data structure as shown in Fig. 2, which represents video data based primarily upon relations between characters.
Referring to Fig. 2, a video data structure is a visual description scheme of a video and begins from a visual DS 201. The visual DS 201 is divided into a syntactic structure DS 202 and a semantic structure DS 203. The syntactic structure DS 202 includes data information corresponding to actual segments of a video while the semantic structure DS 203 includes additional data information describing each segment of a video.
The syntactic structure DS 202 is organized into actual video segments in a segment DS 204 and corresponding temporal positions of each video segment in a time DS 205.
The semantic structure DS 203 is divided into event information in an event DS 206; object information in an object DS 207; and relation information, such as constant relations or variable relations between characters or between characters and events, in an event/object relation graph DS 208.
Particularly, the event DS 206 is organized into a Reference to Segment 209 including information necessary for displaying a video segment corresponding to an event when a user selects an event, and an annotation DS 210 including text information which connects events with actual positions of the events in a video and information for explaining events in a video.
The object DS 207 is organized into an object annotation DS 211 including information for describing objects such as characters or places.
The event/object relation graph DS 208 is organized into an entity relation 212 with a return which allows a tree structure display of relations between characters. According to the present invention, to realize the character relation screen of Fig. 1, constant relations between characters are displayed in the top level of the tree while changes of relations between characters are displayed in a lower level of the tree.
The entity relation 212 is divided into a relation 213, Reference to Object 216, and Reference to Event 217. The relation 213 is organized into a type 214 including information on the nature of relations, and a name 215 including information on the titles of relations. For example, a nature of relation may be 'family' and a title of relation may be 'spouse.' The Reference to Object 216 connects related characters with each other and the Reference to Event 217 connects events which show particular relations between related characters. In the above video data structure, the notation above each data item, such as {0,1}, {0,*}, or {1,*}, indicates the number of instances of the corresponding data. For example, the notation of {0,1} for the syntactic structure DS 202 indicates that the visual DS 201 can have zero or one syntactic structure DS. On the other hand, the notation of {0,*} for the segment DS 204 indicates that the syntactic structure DS 202 may have from zero to any number of segment DSs.
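For illustration only, the Fig. 2 hierarchy can be sketched as a set of Python record types. This is a minimal sketch under assumed names and types; the patent names the description schemes but does not prescribe an encoding, so every class, field, and type choice below is an assumption rather than the patent's own format. The {0,*} cardinalities map naturally to lists.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Segment:                        # segment DS 204: one actual video segment
        segment_id: str
        start: float                      # time DS 205: temporal position (seconds)
        end: float

    @dataclass
    class Event:                          # event DS 206
        name: str
        segment_refs: List[str]           # Reference to Segment 209
        annotation: str                   # annotation DS 210: text explaining the event

    @dataclass
    class Obj:                            # object DS 207: a character or a place
        name: str
        annotation: str                   # object annotation DS 211

    @dataclass
    class Relation:                       # relation 213
        rel_type: str                     # type 214: nature of relation, e.g. 'family'
        name: str                         # name 215: title of relation, e.g. 'spouse'

    @dataclass
    class EntityRelation:                 # entity relation 212
        relation: Relation
        object_refs: List[str]            # Reference to Object 216: related characters
        event_refs: List[str]             # Reference to Event 217: events showing the relation
        children: List["EntityRelation"] = field(default_factory=list)
        # the return structure: variable relations nest below the constant relation

    @dataclass
    class VisualDS:                       # visual DS 201
        segments: List[Segment]           # syntactic structure DS 202, {0,*} segment DSs
        events: List[Event]               # semantic structure DS 203 content
        objects: List[Obj]
        relation_graph: List[EntityRelation]   # event/object relation graph DS 208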
Using a video data structure as described above, a video browser, as shown in Fig. 1, may be implemented. Namely, the characters in a video are displayed on the character screen 101 based upon the object DS 207. If 'character 1' is selected from the character screen 101, other characters related to 'character 1' and changes in the relations are displayed by a tree structure in the relation screen 102 based upon entity relation 212 data, relation 213 data, type 214 data, name 215 data, and Reference to Object 216 data under the event/object relation graph DS 208.
At this time, if 'relation 2' with 'character 2' is selected from the relation screen 102, scenes of events corresponding to relation 2 between 'character 1' and 'character 2' are shown by key frames in the main scene screen 103 based upon event DS 206 data and Reference to Event 217 data. If 'scene 10' is selected from the main scene screen 103, the video segment corresponding to 'scene 10' is displayed in the main screen 104 based upon the Reference to Segment 209 data, the segment DS 204 data, and the time DS 205 data.

Fig. 3 is another example screen of a video story browser which can be implemented using the present invention. The video story browser in Fig. 3 is based upon object-place relations, wherein a plot of a video is summarized by event paragraphs and information of a video segment corresponding to a displayed event paragraph is expressed by a relation graph of objects. Referring to Fig. 3, a relation screen 301 displays a relation graph between characters and places for a video segment corresponding to an event paragraph displayed in a story screen 302, where the event paragraph was selected by a user from a plot of the video composed of a set of event paragraphs. Also, contents of video segments corresponding to a relation selected from the relation screen 301 are displayed by key frames with a brief explanation in a text screen 303.
In the video browser of Fig. 3, characters are displayed in a left column while places are displayed in a right column of the relation screen 301, wherein lines connecting the characters with the places indicate that the connected characters and places have relations. Also, information on at least one event scene corresponding to a relation selected from the relation screen 301 is displayed in the text screen 303 by key frames and text.

Fig. 4 is another video indexing method in accordance with the present invention for a video browser, and can be used as a video data structure to implement a video browser based on an object-place relation graph as shown in Fig. 3.
Referring to Fig. 4, a video structure 401 is divided into a syntactic structure 402 and a semantic structure 403. The syntactic structure 402 is organized into a segment set 404 including actual video segments and, for each video segment 405, a corresponding time 406 data and shot 407, such that an actual video segment can be displayed. Here, a shot is information of a representative frame. The semantic structure 403 is organized into an event 408 including information of events and an object 409 including information of characters. The event 408 is divided into a Reference to Segment 410 including information for connecting an event segment to a corresponding actual video segment, and an annotation 411 including information which summarizes each corresponding event. The object 409 is divided into a Reference to Segment 412 including information for connecting characters to corresponding events, if the object is a character, an object type 413 including information which describes the type of each corresponding object, and an annotation 414 including information which summarizes each corresponding object in relation to the Reference to Segment 412 and object type 413.
In the above video data structure, a return structure of the segment set 404, the event 408, and the object 409 enables a browser to display relations between objects by a tree structure.
Also, the object type 413 is data information which distinguishes whether an object is a character or a place, and to implement a video browser as shown in Fig. 3, the References to Segment 410, 412 including event and object information are used to connect objects such as characters and places to actual video segments corresponding to event segments. Namely, when an object and a place refer to the same event segment, the object and place have a relation. Therefore, if a relation between an object and a place is selected, a video segment corresponding to an event segment showing the selected object-place relation would be displayed using information in segments 404, 405, 406, 407 connected by the Reference to Segment 412 in object 409 and the Reference to Segment 410 in event 408.
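This shared-reference rule can be sketched in a few lines of Python, assuming set-valued Reference to Segment data per object; all character, place, and event segment names here are hypothetical.

    # Characters and places (object 409 entries, distinguished by object type 413)
    # mapped to the event segments their Reference to Segment 412 points at.
    characters = {
        "character 1": {"event 3", "event 7"},
        "character 2": {"event 7"},
    }
    places = {
        "place A": {"event 3"},
        "place B": {"event 7", "event 9"},
    }

    def object_place_relations(chars, plcs):
        """Yield (character, place, shared event segments) for the relation screen."""
        for c, c_events in chars.items():
            for p, p_events in plcs.items():
                shared = c_events & p_events   # same event segment => related
                if shared:
                    yield c, p, shared

    for character, place, events in object_place_relations(characters, places):
        # selecting this relation would play the video segments behind these events
        print(character, "-", place, "via", sorted(events))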
Fig. 5 is another video indexing method in accordance with the present invention for a video browser based on an object-place relation graph as shown in Fig. 3. The video data structure of Fig. 5 includes a video structure 501 and is divided into a syntactic structure 502 and a semantic structure 503 as in the video data structure described with reference to Fig. 4.
Namely, the syntactic structure 502 is organized into a segment set 504, a video segment 505, time 506, and shot 507. The semantic structure 503 is organized into an event 508 and an object 509, where the event 508 is divided into a Reference to Segment 510 and an annotation 511, and the object 509 is divided into a Reference to Segment 512, an object type 513, and an annotation 514. Moreover, as in Fig. 4, the object type 513 is data for discriminating whether an object is a character or a place, and the segment set 504, the event 508, and the object 509 allow a tree structure display with information of different degrees of detail using a return structure. However, the video data structure of Fig. 5 further includes an additional event/object relation graph DS 515 for a more efficient connection relation between objects and places. Particularly, the event/object relation graph DS 515 is organized into an entity relation 516 including information which connects related characters by a Reference to Object 517 and information which connects events to corresponding relations by a Reference to Event 518, thereby displaying connections between related characters, places, and events.
As described above, the connection relations between characters, places, and events are processed and displayed more efficiently than in the video data structure of Fig. 4. Also, the video data structure of Fig. 5 is more advantageous than the structure of Fig. 4 because names of a relation between an object and a place can be identified. That is, if an object-place relation browsing is performed using the video data structure as shown in Fig. 5, and if an object and a place are selected, a video segment corresponding to an event which shows the selected object-place relation would be displayed using information in segments 504, 505, 506, 507. At this time, the Reference to Segment 510 and the Reference to Segment 512 connect with the actual video segments in segments 504-507 through the Reference to Object 517 and the Reference to Event 518.
However, since the event/object relation graph DS 515 is additionally included, the efficiency with respect to the storage space, i.e. data volume, for a video browser is lower than that of the structure of Fig. 4.
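The trade-off can be illustrated with a sketch of the explicit graph: each character-place relation of Fig. 5 is stored once in the event/object relation graph DS 515 with a name, so the browser can label the connecting line and jump straight to the referenced events instead of intersecting segment references. The data and function below are assumptions for illustration, not the patent's encoding.

    # Hypothetical entity relation 516 entries:
    # (Reference to Object 517 pair, relation name, Reference to Event 518 list)
    entity_relations = [
        (("character 1", "place A"), "home of", ["event 3"]),
        (("character 1", "place B"), "works at", ["event 7"]),
    ]

    def events_for(selected_object, selected_place):
        """Return the relation name and events for a selected object-place pair."""
        for (obj, place), name, event_refs in entity_relations:
            if obj == selected_object and place == selected_place:
                return name, event_refs    # one lookup, no set intersection needed
        return None, []

    print(events_for("character 1", "place B"))   # ('works at', ['event 7'])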
Fig. 6 is another video indexing method in accordance with the present invention for a video browser which displays a plot of a video in event segments by event paragraphs. The video data structure of Fig. 6 includes a video structure 601 and is divided into a syntactic structure 602 and a semantic structure 603 as in the video data structure described with reference to Fig. 4. Accordingly, the syntactic structure 602 is organized into a segment set 604 including actual video segments and, for each video segment 605, a corresponding time 606 and shot 607, such that an actual video segment can be displayed. However, the semantic structure 603 is organized into an event 608 which is divided into a Reference to Segment 609 including information for connecting an event segment to a corresponding actual video segment, and an annotation 610 including information which summarizes each corresponding event segment. In the above video data structure, the segment set 604 and the event 608 enable a tree structure display of object relations using a return structure. Particularly, the event 608 allows a tree structure display, wherein a set of event summaries in a top level composes a plot of a video and each event summary, which is generally presented as one paragraph, is connected to a corresponding video segment.
Thus, in the video data structure as shown in Fig. 6, a paragraph selected from a plot of a video, and the video segment corresponding to the selected paragraph, can be displayed in various styles using an additional data structure 611. Such an additional data structure can be provided for video data browsing based on content as described with reference to the relation screen 301 and text screen 303 of Fig. 3.
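A minimal sketch of this arrangement follows: the plot held as a list of one-paragraph event summaries (annotation 610), each carrying a Reference to Segment 609 style link to the video segment it summarizes. The paragraphs and time codes are invented examples.

    plot = [
        {"paragraph": "Character 1 and character 2 first meet at the station.",
         "segment": ("00:03:10", "00:06:42")},
        {"paragraph": "A quarrel divides the two families.",
         "segment": ("00:21:05", "00:27:30")},
    ]

    def browse(selected_index):
        """Selecting a paragraph from the story screen plays its video segment."""
        entry = plot[selected_index]
        start, end = entry["segment"]
        print(f"Playing {start}-{end}: {entry['paragraph']}")

    browse(1)   # plays the segment behind the second event paragraph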
Fig. 7 is another video indexing method in accordance with the present invention for a video browser, which represents the additional data structure 611 of Fig. 6 by a connection graph between related characters and places such that events corresponding to each relation can be viewed.
Referring to Fig. 7, a video DS 701 is divided into a syntactic structure DS 702 including actual segments of video and a semantic structure DS 703 including additional data information describing each segment of a video.
The syntactic structure DS 702 is organized into actual video segments in a segment DS 704 and corresponding temporal positions of each video segment in a time DS 705.
The semantic structure DS 703 is organized into event information in an event DS 706; object information, such as characters or places, in an object DS 707; and relation information, such as constant relations between characters, variable relations between characters, or relations between characters and events, in an event/object relation graph DS 708.
The event DS 706 is divided into a Reference to Segment 709 including information necessary for displaying a video segment corresponding to an event, when a user selects an event, and an annotation DS 710 including information which connects events with actual positions of the events in a video and information for explaining events in a video.
The segment DS 704 and the event DS 706 are organized by two levels, wherein level 0 includes a set of event segments which defines a video, and level 1 includes information of an event segment which corresponds to a selected relation between a character and place.
The object DS 707 is divided into a Reference to Segment 711 including information for connecting characters to corresponding event segments, if the object is a character, an object type 712 including information which describes the type of each corresponding object, and an annotation DS 713 including information which summarizes each corresponding object in relation to the Reference to Segment 711 and object type 712.
Also, the event/object relation graph DS 708 is organized into an entity relation 714 including information which connects related characters by a Reference to Object 715 and information which connects events to corresponding relations by a Reference to Event 716.
Therefore, according to the video data structure as shown in Fig. 7, if one of the paragraphs composing a plot of a video is selected and displayed on the story screen 302 of a video browser in Fig. 3, event segments of level 0 corresponding to the selected paragraph would be selected. At this time, a graph of characters and places is displayed using the objects connected to the displayed event segments. If a relation is selected from the character-place graph, an event segment of level 1 corresponding to the selected relation would be selected. Here, a correspondence means an event segment of level 1 which is commonly connected with both the character and the place.
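The two-level selection path might be traced as in the sketch below, with hypothetical data standing in for the level 0 and level 1 organization of the segment DS 704 and event DS 706; every name and time code is an invented example.

    level0 = {"paragraph 2": ["event 7"]}     # plot paragraph -> level-0 event segments
    objects_by_event = {                      # objects whose references name each event
        "event 7": {"characters": ["character 1"], "places": ["place B"]},
    }
    level1 = {                                # (character, place) -> level-1 segment
        ("character 1", "place B"): ("00:22:10", "00:24:55"),
    }

    # Step 1: a paragraph selected on the story screen yields its level-0 events.
    events = level0["paragraph 2"]
    # Step 2: build the character-place graph from objects connected to those events.
    graph = [(c, p) for e in events
             for c in objects_by_event[e]["characters"]
             for p in objects_by_event[e]["places"]]
    # Step 3: selecting a relation resolves the level-1 event segment common to both.
    start, end = level1[graph[0]]
    print(f"Relation {graph[0]} -> segment {start}-{end}")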
Fig. 8 is another video indexing method in accordance with the present invention for a video browser based on constant relations and changes in relations between characters according to the development of events. Also, objects and places are represented by an object-place relation graph and event segments satisfying each of the object-place relations can be displayed. Furthermore, a plot of a video can be expressed by a set of event paragraphs, and video browsing can be performed based upon the content with relation to the video segments divided by the respective paragraphs.
The video data structure of Fig. 8 includes a visual DS 801 and is divided into a syntactic structure 802 and a semantic structure 803 as in the video data structure described with reference to Fig. 7.
Namely, the syntactic structure 802 is organized into a segment DS 804 and a time DS 805. The semantic structure DS 803 is organized into an event DS 806, an object DS 807 and an event/object relation graph DS 808, where the event DS 806 is divided into a Reference to Segment 809 and an annotation 810 and the object DS 807 is divided into a Reference to Segment 811, an object type 812, and an annotation DS 813. Moreover, as in Fig. 7, the segment DS 804, the event DS 806, and the object DS 807 enable a tree structure display with information of different degrees of detail using a return structure, and the object type 812 is data for discriminating whether an object is a character or a place. Furthermore, a return of an entity relation 814 of the event/object relation graph DS 808 allows a tree structure display with subordinate relations. For example, constant relations between characters are placed in an upper level of a tree and changes in relations between characters are placed in a lower level of the tree.
However, in the video data structure of Fig. 8, the entity relation 814 is divided into a relation 815, Reference to Object 818, and Reference to Event 819. The relation 815 under the entity relation 814 is further divided into a type 816 including information on the nature of relations, and a name 817 including information on the titles of relations. For example, a nature of relation may be 'family' and a title of relation may be 'spouse' as discussed above. As in Fig. 7, the Reference to Object 818 connects related characters and the Reference to Event 819 connects characters with events corresponding to a particular relation.
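As a final illustration, the relation tree that the entity relation 814 return structure supports might be held and rendered as below; the dictionary shape, relation types, names, and events are all invented examples rather than the patent's format.

    relation_tree = {
        "objects": ("character 1", "character 2"),
        "relation": {"type": "family", "name": "spouse"},    # type 816 / name 817
        "events": ["event 1"],                               # Reference to Event 819
        "children": [   # variable relations, shown in a lower level of the tree
            {"relation": {"type": "emotional", "name": "estranged"},
             "events": ["event 7"], "children": []},
            {"relation": {"type": "emotional", "name": "reconciled"},
             "events": ["event 12"], "children": []},
        ],
    }

    def print_tree(node, depth=0):
        """Render the relation screen: constant relation on top, changes below."""
        r = node["relation"]
        print("  " * depth + f"{r['name']} ({r['type']}) -> {node['events']}")
        for child in node["children"]:
            print_tree(child, depth + 1)

    print_tree(relation_tree)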
As described above, according to the present invention, video browsing may be performed based upon character relations in a movie or a drama, such that a user may easily reproduce and watch a section related to a desired plot of the movie or the drama. Also, a video according to the present invention may be expressed by a relation graph between objects, places and events, such that a video indexing and browsing method based on the video content may be implemented. Moreover, a video according to the present invention may also be expressed by a set of event paragraphs, and video indexing and browsing based on content may be performed per segment represented by respective paragraphs, such that clearer and more effective video indexing and browsing may be achieved.
The present invention may be applied to a VOD system in a broadcasting field for transmitting a particular portion to a user, such that only a selected video segment may be rapidly reproduced, thereby allowing an effective utilization of network resources. Finally, the present invention may be applied to domestic or broadcasting video reproducers for convenient video browsing of a desired segment of various video sources such as recorded movies, dramas, or sports games.
The foregoing embodiments are merely exemplary and are not to be construed as limiting the present invention. The present teachings can be readily applied to other types of apparatuses. The description of the present invention is intended to be illustrative, and not to limit the scope of the claims. Many alternatives, modifications, and variations will be apparent to those skilled in the art.

Claims

WHAT IS CLAIMED IS:
1. A video data structure for a video browser based on content comprising: a syntactic structure DS which includes data information corresponding to actual video segments of a video; and a semantic structure DS which includes additional data information describing each actual video segment of a video, wherein said semantic structure includes at least one event DS which includes event information and at least one object DS which includes object information.
2. A video data structure of claim 1, wherein the syntactic structure DS is organized into at least one segment DS including actual video segment data and a time DS for each segment DS including temporal positions of a corresponding actual video segment data within a video data.
3. A video data structure of claim 2, wherein each segment DS has a return structure.
4. A video data structure of claim 2, wherein the semantic structure DS further comprises at least one event/object relation graph DS which includes information on one of a constant relation between objects, a variable relation between objects, or a relation between an object and event.
5. A video data structure of claim 4, wherein each event DS and each object DS respectively have return structures.
6. A video data structure of claim 4, wherein each event DS comprises: at least one Reference to Segment which includes reference information necessary for displaying a video segment of a video corresponding to events selected by a user; and an annotation DS which includes information which connects said selected events with actual positions of said selected events within a video data and information explaining said selected events.
7. A video data structure of claim 4, wherein each object DS comprises: at least one Reference to Segment which includes reference information for connecting objects in a video to corresponding events; an object type which includes information which describes a type of each object; and an annotation which includes summary information of each object.
8. A video data structure of claim 4, wherein each event/object relation graph DS is organized into an entity relation comprising: at least one Reference to Object which connects objects having either a constant relation or a variable relation; and at least one Reference to Event which connects events which are significant to a relation between the connected objects.
9. A video data structure of claim 8, wherein each entity relation further includes a relation comprising: a type which includes information on a type of relation between the connected objects; and a name which includes information on a title of relation between the connected objects.
10. A video data structure of claim 1, wherein the syntactic structure is organized into at least one segment which includes actual video segment data, and for each segment data, a corresponding time data and shot.
11. A video data structure of claim 10, wherein each segment has a return structure.
12. A video data structure of claim 10, wherein the semantic structure DS is organized into an event DS comprising: at least one Reference to Segment which includes information for connecting an event segment to a corresponding actual video segment data; and an annotation which includes information which summarizes each corresponding event.
13. A video data structure of claim 12, wherein the semantic structure DS further includes an object comprising: at least one Reference to Segment which includes reference information for connecting objects in a video to corresponding events; an object type which includes information which describes a type of each object; and an annotation which includes summary information of each object.
14. A video data structure of claim 13, wherein the semantic structure DS further includes an event/object relation graph DS comprising: at least one Reference to Object which connects objects having either a constant relation or a variable relation; and at least one Reference to Event which connects events which are significant to a relation between the connected objects.
15. A video data structure of claim 14, wherein each event, each object, and each entity relation respectively have return structures.
16. A video data structure for video browsing based on content, comprising: a syntactic structure DS which is organized into at least one segment DS including actual video segment data and a time DS for each segment DS including temporal positions of corresponding actual video segment data within a video data; and a semantic structure DS which includes additional data information describing each actual video segment of a video, wherein the semantic structure DS is organized into at least one event DS including event information, at least one object DS including object information, and at least one event/object relation graph DS including information on one of a constant relation between objects, a variable relation between objects, or a relation between an object and event.
17. A video data structure of claim 16, wherein each event DS comprises: at least one Reference to Segment including reference information necessary for displaying a video segment of a video corresponding to events selected by a user; and an annotation DS including information which connects said selected events with actual positions of said selected events within a video data and information explaining said selected events.
18. A video data structure of claim 17, wherein each object DS comprises: at least one Reference to Segment which includes reference information for connecting objects in a video to corresponding events; an object type which includes information which describes a type of each object; and an annotation which includes summary information of each object.
19. A video data structure for video browsing comprising: a syntactic structure DS which includes data information corresponding to actual video segments of a video; and a semantic structure DS which includes additional data information describing each actual video segment of the video, wherein said semantic structure includes a plurality of event data, each including an event description data for representing an event by text, and wherein said semantic structure includes data for linking to the syntactic structure DS; wherein the video is divided and browsed based on events represented by each of said plurality of event data.
20. A video data structure of claim 19, wherein the semantic structure DS further comprises one or a combination of: at least one object data for describing an object, at least one place data describing a place, and at least one event/object relation graph data representing a relation of an object, a place and an event; and wherein the video is divided and browsed based on events represented by each of said plurality of event data.
PCT/KR2000/000967 1999-08-26 2000-08-25 Video data structure for video browsing based on content WO2001015015A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU67383/00A AU6738300A (en) 1999-08-26 2000-08-25 Video data structure for video browsing based on content

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1999/35687 1999-08-26
KR10-1999-0035687A KR100518846B1 (en) 1999-08-26 1999-08-26 Video data construction method for video browsing based on content

Publications (1)

Publication Number Publication Date
WO2001015015A1

Family

ID=19608820

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2000/000967 WO2001015015A1 (en) 1999-08-26 2000-08-25 Video data structure for video browsing based on content

Country Status (3)

Country Link
KR (1) KR100518846B1 (en)
AU (1) AU6738300A (en)
WO (1) WO2001015015A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2374508A (en) * 2001-04-02 2002-10-16 Cedemo Ltd Integrated display of information on a screen

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100392257B1 (en) * 2001-02-12 2003-07-22 한국전자통신연구원 A Method of Summarizing Sports Video Based on Visual Features
KR101274820B1 (en) * 2011-02-24 2013-06-13 한국방송공사 Apparatus for providing multimedia broadcast services relate to location information and object information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0719046A2 (en) * 1994-11-29 1996-06-26 Siemens Corporate Research, Inc. Method and apparatus for video data management
US5586316A (en) * 1993-07-09 1996-12-17 Hitachi, Ltd. System and method for information retrieval with scaled down image

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5708767A (en) * 1995-02-03 1998-01-13 The Trustees Of Princeton University Method and apparatus for video browsing based on content and structure
KR100347710B1 (en) * 1998-12-05 2002-10-25 엘지전자주식회사 Method and data structure for video browsing based on relation graph of characters
KR100368324B1 (en) * 1999-06-23 2003-01-24 한국전자통신연구원 A apparatus of searching with semantic information in video and method therefor
KR20010004808A (en) * 1999-06-29 2001-01-15 박웅규 Video Indexing Method for Semantic Search
KR100370247B1 (en) * 1999-08-26 2003-01-29 엘지전자 주식회사 Video browser based on character relation
KR100319158B1 (en) * 1999-08-26 2001-12-29 구자홍 Video browsing system based on event

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5586316A (en) * 1993-07-09 1996-12-17 Hitachi, Ltd. System and method for information retrieval with scaled down image
EP0719046A2 (en) * 1994-11-29 1996-06-26 Siemens Corporate Research, Inc. Method and apparatus for video data management

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2374508A (en) * 2001-04-02 2002-10-16 Cedemo Ltd Integrated display of information on a screen

Also Published As

Publication number Publication date
KR100518846B1 (en) 2005-09-30
KR20010019341A (en) 2001-03-15
AU6738300A (en) 2001-03-19

Similar Documents

Publication Publication Date Title
US20060075361A1 (en) Video browser based on character relation
KR100350787B1 (en) Multimedia browser based on user profile having ordering preference of searching item of multimedia data
US6602297B1 (en) Motional video browsing data structure and browsing method therefor
US8762850B2 (en) Methods systems, and products for providing substitute content
US7698720B2 (en) Content blocking
US6732369B1 (en) Systems and methods for contextually linking television program information
US7031596B2 (en) Digital video reproduction method, digital video reproducing apparatus and digital video recording and reproducing apparatus
US8615782B2 (en) System and methods for linking television viewers with advertisers and broadcasters
US20050220439A1 (en) Interactive multimedia system and method
US20030122861A1 (en) Method, interface and apparatus for video browsing
US20050033849A1 (en) Content blocking
JP2006101526A (en) Method of summarizing hierarchical video utilizing synthetic key frame and video browsing interface
JP2004517532A (en) Embedding object-based product information in audiovisual programs that is reusable for non-intrusive and viewer-centric use
JP2005522112A (en) Method and system for providing supplemental information for video programs
US11070883B2 (en) System and method for providing a list of video-on-demand programs
WO2001027876A1 (en) Video summary description scheme and method and system of video summary description data generation for efficient overview and browsing
WO2001015017A1 (en) Video data structure for video browsing based on event
US20090183202A1 (en) Method and apparatus to display program information
WO2001015015A1 (en) Video data structure for video browsing based on content
JP2001134614A (en) Operable system for providing descriptive frame work and method for providing outline of av contents
EP1726160A2 (en) Interactive multimedia system and method

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP