WO2005083592A1 - Method and apparatus for locating content in a program - Google Patents

Publication number
WO2005083592A1
Authority
WO
WIPO (PCT)
Prior art keywords
word symbol
information
stream
user
program
Prior art date
Application number
PCT/IB2005/050415
Other languages
French (fr)
Inventor
Xin Chen
Yongqin Zeng
Ningjiang Chen
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to JP2007500321A (JP2007525900A)
Priority to EP05702854A (EP1723555A1)
Publication of WO2005083592A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2387 Stream processing in response to a playback request from an end-user, e.g. for trick-play

Definitions

  • The present invention relates to a method and apparatus for locating program contents, and in particular to a method and apparatus for locating segments of a multimedia program according to its contents.
  • In addition to a video stream and an audio stream, a multimedia program generally contains an image stream and/or a text stream. These streams are synchronized with each other according to particular rules and a predetermined time sequence.
  • The Synchronized Multimedia Integration Language (SMIL) is a popular editing language.
  • SMIL can not only integrate the respective content streams of a multimedia program in time sequence, but can also be used to manage the layout of the multimedia program being presented. While watching a multimedia program, a user sometimes needs to find a particular segment of the program.
  • The multimedia playing apparatus should be able to automatically analyze and match the contents of the video streams, so that when the Sydney Opera Theater appears, the related segment is presented to the user.
  • In content locating as described above, if a user performs the locating manually, he or she has to search repeatedly before finding the desired position of the segment, which is time-consuming and bothersome.
  • The editing tools provide only a very limited number of titles for users to choose from, which restricts the freedom of the user's choices and makes user-defined choices impossible. Therefore, a new program content locating method and apparatus is needed that enables users to locate program contents in multimedia programs conveniently, so that their individual requirements can be satisfied by obtaining whichever segments they want.
  • One of the objects of the present invention is to provide a new program content locating method and apparatus that overcomes the defects of the prior art and enables users to locate program contents in multimedia programs conveniently to obtain the particular segments they want.
  • The present invention provides a method for locating content in a multimedia program that comprises a stream with word symbol information, the method comprising: first, receiving a request comprising a specific word symbol from a user; then, determining a position where the specific word symbol appears in the stream with word symbol information; and finally, determining other presentable information synchronous with the word symbol information at the position.
  • The other presentable information may be video information or audio information.
  • The word symbol information may exist in a text format or an image format.
  • When the word symbol information exists in an image format, the locating method further comprises the step of obtaining the text information corresponding to the word symbol information.
  • The stream provided with word symbol information may have a layered structure. If so, the locating method further comprises the step of determining the layer that contains the position where the specific word symbol appears, the layer having a particular starting position and a particular end position, so that the other presentable information finally determined has the corresponding starting position and end position.
  • The present invention further provides an apparatus for locating contents in a multimedia program that has a stream provided with word symbol information.
  • The word symbol information may exist in a text format or an image format.
  • The apparatus includes a request receiving means, a word symbol locating means and a synch-locating means.
  • The request receiving means is used to receive a request comprising a specific word symbol from a user; the word symbol locating means is used to determine the position where the specific word symbol appears in the stream provided with word symbol information; and the synch-locating means is used to determine the other presentable information that is synchronized with the word symbol information appearing at the position.
  • The other presentable information may be video information or audio information.
  • The present invention locates the position of a user-required segment in a program by analyzing the stream provided with word symbol information that is included in the multimedia program, and then finds the corresponding video or audio segment according to the synchronization rules.
  • Streams provided with word symbol information, such as text streams or image streams, contain much less data than video or audio, and the analysis of text is much simpler than that of pictures or audio. The present invention therefore greatly reduces the complexity of searching program contents, lowers the hardware requirements, makes the user's operation convenient, and satisfies the different needs of individual users.
  • Figure 1 is a system block diagram of an apparatus for locating contents in a multimedia program according to an embodiment of the present invention
  • Figure 2 is a flow chart of the process for locating contents in a multimedia program according to an embodiment of the invention
  • Figure 3 is a flow chart of the process for locating contents in a multimedia program and extracting particular segments according to another embodiment of the present invention
  • In all the figures, the same reference numbers indicate similar or identical features and functions.
  • Figure 1 shows a system block diagram of an apparatus for locating contents in multimedia programs according to an embodiment of the present invention.
  • The apparatus 100 may be part of a multimedia program making apparatus (not shown in this figure) or a multimedia playing apparatus (not shown in this figure).
  • The apparatus 100 includes a request receiving module 120, a text locating module 130 and a synch-locating module 140.
  • The apparatus 100 further includes a content receiving module 110, a presentation module 150 and an extraction module 160.
  • The modules included in the apparatus 100 can be realized by those skilled in the art using various existing modules, as long as their combination can perform the functions of the present invention.
  • The content receiving module 110 is used to receive a multimedia program that contains a stream provided with word symbol information, such as a text stream or an image stream carrying the word symbol information (for example, the slides of the auxiliary demonstration tools in existing multimedia demonstration programs, such as one page of a PowerPoint file, are sometimes transmitted in an image format).
  • The multimedia program may come from a local storage module (not shown in the figure), such as a DVD, or from a web server (not shown in the figure).
  • The request receiving module 120 is used to receive a request that contains a specific word symbol, such as "Sydney Opera Theater". With this request, the user hopes to find the segment on the Sydney Opera Theater in the multimedia program being edited or viewed.
  • The multimedia program includes a stream provided with word symbol information.
  • The text locating module 130 is used to determine the position of the specific word symbol in the multimedia program.
  • Module 130 searches for the specific word symbol, such as "Sydney Opera Theater", in the stream provided with word symbol information and, after the specific word symbol is found, obtains the information on its position in the program. If the stream provided with word symbol information is an image stream, module 130 is further used to obtain the text information corresponding to the word symbol information in the image stream.
  • The synch-locating module 140 is used to determine the other presentable information that is synchronized with the word symbol information appearing at the site.
  • The presentation module 150 is used to present to the user the program contents at the particular position within a multimedia program.
  • The extraction module 160 is used to extract a particular segment from a multimedia program. In this embodiment, the particular segment may contain the particular text information.
  • Figure 2 is a flow chart of a process for locating contents in a multimedia program according to an embodiment of the present invention.
  • First, a multimedia program including a stream provided with word symbol information is obtained in step 210 (S210).
  • In this embodiment the word symbol information exists in a text format. For example, in the case of a multimedia digital television program stream, the captions exist in the data stream in a text format; in the case of a multimedia demonstration program stream, the wording of the demonstration exists in a text stream in a text format. If the multimedia program is relatively long, this step will not end until the entire locating process ends.
  • A multimedia program about Australian scenery is taken as an example.
  • The program includes a text stream carrying the corresponding commentary.
  • A request containing a specific word symbol, such as "Sydney Opera Theater", is received from a user (S220); the user expects that the specific word symbol exists at a certain position in the text stream and hopes to find the segment containing it in the multimedia program obtained in S210.
  • The specific word symbol is searched for in the text stream, and it is judged whether it has been found at a particular site in the text stream (S230). If it has not been found, the process informs the user that the specific word symbol was not found in this multimedia program (S234), and the entire process ends. If it has been found, the process obtains the position information of the site where it appears (S238), for example, that of "Sydney Opera Theater" at "01:03:06" (hh:mm:ss) from the start of the program.
  • The corresponding position of the specific word symbol in the video stream is determined on the basis of the particular synchronization rules of the multimedia program (S240). For example, the video at "01:03:06" (hh:mm:ss) from the start of the program is found; the picture at that time typically contains the scenery of the Sydney Opera Theater corresponding to the commentary.
  • The synchronization rules for a multimedia program can vary and will not be elaborated here.
  • The video contents at the particular position are presented to the user (S250); the pictures at the particular position contain the scenery of the Sydney Opera Theater that the user wants.
  • It is possible to present to the user all the contents of the multimedia program at this particular position, such as the video/audio, image and text, or to present only part of them, for example the audio only, to satisfy his or her individual needs.
  • In the presenting process of S250, it is also possible to present the video contents in the periods of time before and/or after this particular site appears.
  • The duration of the period may be set to a value chosen by the user or to a default value fixed by the system.
  • The user may include starting position information and ending position information in the request of S220, both of which correspond to the particular site expected by the user.
  • The position in the audio or image streams corresponding to where the specific word symbol appears may likewise be determined according to the synchronization rules. Since video, audio and even images are more complex than text in composition, analyzing and locating them is also much more complex than analyzing and locating text. The locating method of the present invention is therefore much simpler than prior methods that work through the audio or video. In said locating process, if the specific word symbol, such as "Sydney Opera Theater", appears many times in said text stream, then when S250 presents the video contents of the particular site to the user, the user is given a chance to choose whether to keep on searching.
  • Figure 3 is a flow chart of a process for locating contents in a multimedia program and extracting particular segments according to another embodiment of the present invention.
  • A multimedia program is obtained (S310), which includes a stream provided with word symbol information existing in an image format.
  • Its demonstration slides contain word symbol information and exist in an image stream in an image format.
  • Table 1 is a SMIL Script of a multimedia demonstration program including a video stream and an image stream synchronized with the video stream; said image stream includes the demonstration slides and the words on the slides, and these words are in an image format.
  • Table 1: A Multimedia Demonstration Program. It can be seen from Table 1 that said image stream, which has a layered structure, contains 9 sections: image1, image2, image3, image4, image5, image6, image7, image8 and image9.
  • Each section corresponds to one slide. Each section has its own starting position and duration because, while the video and audio generally change constantly during the demonstration, each slide normally remains unchanged for a period of time. Since it is impossible to directly conduct a textual analysis of words existing in an image format, some means must be used to obtain the text information corresponding to the word symbol information in the image stream (S320). This step can be performed with existing Optical Character Recognition (OCR) technology. Then, a request containing a specific word symbol is received from the user (S330); the user expects that the specific word symbol exists at one or more positions in said multimedia program stream and hopes to find and extract the segments including it through the request.
  • The specific word symbol is searched for in the word symbol information of the image stream, and it is judged whether it has been found at a particular site (S340). If it is not found, the user is informed that the specific word symbol was not found in this multimedia program (S344), and the entire process ends. If it is found, the information about the particular site where it appears is obtained (S350). For example, if the specific word symbol appears in the word symbol information of image2, then the starting position and the duration of image2 are obtained. After that, the position in the video stream corresponding to the site where the words appear is determined according to the synchronization rules of the particular multimedia program (S360). At this point, the starting position and duration of the particular segment of the corresponding video stream are the same as those of image2.
  • The original SMIL Script is modified on the basis of the obtained starting position and duration of the particular segment to obtain a new SMIL Script (S370).
  • This SMIL Script reflects only the segment found, thus making it possible to extract the particular segment the user needs from the multimedia program. By selectively running the modified SMIL Script, the user can directly browse the needed segment.
  • A further judgment can be made as to whether it is necessary to continue the search (S380). If it is not, the entire extracting process ends; if it is, the process returns to S340 and continues the search from the last found site along the original search direction until the next segment or program the user wants to watch is found.
  • The judgment can be made automatically by checking whether the multimedia program has ended, or the user can be prompted to decide.
  • said particular text information is also found in image ⁇ and image ⁇ .
  • The modified SMIL Script finally obtained is shown in Table 2.
  • The multimedia program segments corresponding to the SMIL Script contain said particular text information.
  • The stream provided with word symbol information of the multimedia program has a layered structure.
  • The layered structure can be presented as 9 parallel images arranged in sequence like the chapters of a book, that is, the respective layers can be mutually contained.
  • The present invention uses the streams provided with word symbol information contained in the multimedia program to perform the locating. Since the analysis of word symbol information is much simpler than that of audio/video information, the present invention frees program producers from a lot of work and reduces the complexity of that work. It enables users to perform the locating operation relatively easily with simpler and less expensive equipment. Furthermore, it also makes it possible to use voice recognition technology to convert dialogue in the audio into text information to be used for the locating operation.
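
The extraction flow of Figure 3 (S340 to S370) can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the script from the application: Tables 1 and 2 are not reproduced in this text, so the SMIL layout, the section timings, the slide texts, the file name `lecture.rm` and the names `Section`, `matching_sections` and `smil_for` are all hypothetical. The sketch also assumes that the text of each slide image has already been recovered (S320), and it expresses each found segment with SMIL 2.0 media clipping attributes (`clipBegin`, `clipEnd`).

```python
from dataclasses import dataclass
from typing import List
from xml.sax.saxutils import quoteattr


@dataclass(frozen=True)
class Section:
    """One layer of the image stream, i.e. one slide (hypothetical structure)."""
    name: str
    start: int     # seconds from the start of the program
    duration: int  # seconds the slide stays on screen
    text: str      # text assumed to be already recovered from the image (S320)


def matching_sections(sections: List[Section], phrase: str) -> List[Section]:
    """S340/S350: keep the sections whose recovered text contains the phrase."""
    needle = phrase.casefold()  # case-insensitive matching is an assumption
    return [s for s in sections if needle in s.text.casefold()]


def smil_for(sections: List[Section], video_src: str) -> str:
    """S360/S370: emit a script that plays only the matching video segments."""
    lines = ["<smil>", "  <body>", "    <seq>"]
    for s in sections:
        # The video segment shares the start and duration of its slide.
        lines.append(
            f"      <video src={quoteattr(video_src)} "
            f'clipBegin="{s.start}s" clipEnd="{s.start + s.duration}s"/>'
        )
    lines += ["    </seq>", "  </body>", "</smil>"]
    return "\n".join(lines)


# Hypothetical slide data; the real program has 9 sections.
sections = [
    Section("image1", 0, 60, "Agenda"),
    Section("image2", 60, 90, "Word symbol search in streams"),
    Section("image3", 150, 45, "Hardware cost"),
    Section("image5", 240, 30, "Search results and extraction"),
]
hits = matching_sections(sections, "search")
script = smil_for(hits, "lecture.rm")
print(script)
```

Running the generated script instead of the original one would present only the segments whose slides mention the requested words, which is the effect described for the modified SMIL Script.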

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention provides a method for locating content in a multimedia program comprising a stream with word symbol information. The method comprises the steps of receiving a request including a specific word symbol from a user; determining the position at which the specific word symbol appears in the stream with word symbol information; and determining other presentable information synchronous with the word symbol information at that position. Compared with video or audio information, a stream with word symbol information contains a much smaller amount of data, and the analysis of word symbols is much simpler, so the present invention greatly reduces the complexity of searching program contents, lowers the hardware requirements, simplifies the user's operation and satisfies users' individual requirements.

Description

METHOD AND APPARATUS FOR LOCATING CONTENT IN A PROGRAM
TECHNICAL BACKGROUND
The present invention relates to a method and apparatus for locating program contents, and in particular to a method and apparatus for locating segments of a multimedia program according to its contents. In addition to a video stream and an audio stream, a multimedia program generally contains an image stream and/or a text stream. These streams are synchronized with each other according to particular rules and a predetermined time sequence. Among the numerous multimedia program editing rules, the Synchronized Multimedia Integration Language (SMIL) is a popular editing language. SMIL can not only integrate the respective content streams of a multimedia program in time sequence, but can also be used to manage the layout of the multimedia program being presented.
While watching a multimedia program, a user sometimes needs to find a particular segment of the program. For example, one may need to find the part on Iraq within a multimedia program of the lecture given by President Bush at Tsinghua University. The user can locate that part by fast-forwarding or rewinding the recording medium and listening for the audio content. As another example, a user may want to browse directly to a segment on the Sydney Opera Theater in a recorded multimedia program depicting Australian scenery. To meet this need, the multimedia playing apparatus should be able to automatically analyze and match the contents of the video streams, so that when the Sydney Opera Theater appears, the related segment is presented to the user.
In content locating as described above, if a user performs the locating manually, he or she has to search repeatedly before finding the desired position of the segment, which is time-consuming and bothersome. If the user performs the locating by having the multimedia playing apparatus scan automatically, the search is very difficult because of the complexity and size of the video and audio streams, and more sophisticated hardware is needed, which increases the cost for the user.
Additionally, there are various authoring tools on the market, such as PresenterOne, developed by the US Accordent Corporation, and Presentation Maker, developed by the Canadian SofTV.net Corporation, for the convenience of editing multimedia programs, especially multimedia demonstration programs. These tools allow a user to list the titles of the text slides of a multimedia demonstration, and the user can use these titles as indexes to locate the corresponding segments. Although this simplifies the searching process to a certain degree, it works only if the above editing tools were used to make the multimedia demonstration program. Furthermore, the editing tools provide only a very limited number of titles for users to choose from, which restricts the freedom of the user's choices and makes user-defined choices impossible. Therefore, a new program content locating method and apparatus is needed that enables users to locate program contents in multimedia programs conveniently, so that their individual requirements can be satisfied by obtaining whichever segments they want.
SUMMARY OF THE PRESENT INVENTION
Therefore, one of the objects of the present invention is to provide a new program content locating method and apparatus that overcomes the defects of the prior art and enables users to locate program contents in multimedia programs conveniently to obtain the particular segments they want.
The present invention provides a method for locating content in a multimedia program that comprises a stream with word symbol information, the method comprising: first, receiving a request comprising a specific word symbol from a user; then, determining a position where the specific word symbol appears in the stream with word symbol information; and finally, determining other presentable information synchronous with the word symbol information at the position. The other presentable information may be video information or audio information. The word symbol information may exist in a text format or an image format. When it exists in an image format, the locating method further comprises the step of obtaining the text information corresponding to the word symbol information. The stream provided with word symbol information may have a layered structure. If so, the locating method further comprises the step of determining the layer that contains the position where the specific word symbol appears, the layer having a particular starting position and a particular end position, so that the other presentable information finally determined has the corresponding starting position and end position.
The present invention further provides an apparatus for locating contents in a multimedia program that has a stream provided with word symbol information. The word symbol information may exist in a text format or an image format. The apparatus includes a request receiving means, a word symbol locating means and a synch-locating means. The request receiving means is used to receive a request comprising a specific word symbol from a user; the word symbol locating means is used to determine the position where the specific word symbol appears in the stream provided with word symbol information; and the synch-locating means is used to determine the other presentable information that is synchronized with the word symbol information appearing at the position. The other presentable information may be video information or audio information.
The present invention locates the position of a user-required segment in a program by analyzing the stream provided with word symbol information that is included in the multimedia program, and then finds the corresponding video or audio segment according to the synchronization rules. Streams provided with word symbol information, such as text streams or image streams, contain much less data than video or audio, and the analysis of text is much simpler than that of pictures or audio. The present invention therefore greatly reduces the complexity of searching program contents, lowers the hardware requirements, makes the user's operation convenient, and satisfies the different needs of individual users. Other objects and advantages of the present invention will become evident, and the present invention will be better understood, through the description and claims made with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is explained in detail by way of examples and with reference to the accompanying drawings, in which: Figure 1 is a system block diagram of an apparatus for locating contents in a multimedia program according to an embodiment of the present invention; Figure 2 is a flow chart of the process for locating contents in a multimedia program according to an embodiment of the invention; and Figure 3 is a flow chart of the process for locating contents in a multimedia program and extracting particular segments according to another embodiment of the present invention. In all the figures, the same reference numbers indicate similar or identical features and functions.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Figure 1 shows a system block diagram of an apparatus for locating contents in multimedia programs according to an embodiment of the present invention. The apparatus 100 may be part of a multimedia program making apparatus (not shown in this figure) or a multimedia playing apparatus (not shown in this figure). The apparatus 100 includes a request receiving module 120, a text locating module 130 and a synch-locating module 140. The apparatus 100 further includes a content receiving module 110, a presentation module 150 and an extraction module 160. The modules included in the apparatus 100 can be realized by those skilled in the art using various existing modules, as long as their combination can perform the functions of the present invention.
The content receiving module 110 is used to receive a multimedia program that contains a stream provided with word symbol information, such as a text stream or an image stream carrying the word symbol information (for example, the slides of the auxiliary demonstration tools in existing multimedia demonstration programs, such as one page of a PowerPoint file, are sometimes transmitted in an image format). The multimedia program may come from a local storage module (not shown in the figure), such as a DVD, or from a web server (not shown in the figure).
The request receiving module 120 is used to receive a request that contains a specific word symbol, such as "Sydney Opera Theater". With this request, the user hopes to find the segment on the Sydney Opera Theater in the multimedia program being edited or viewed. The multimedia program includes a stream provided with word symbol information.
The text locating module 130 is used to determine the position of the specific word symbol in the multimedia program. Module 130 searches for the specific word symbol, such as "Sydney Opera Theater", in the stream provided with word symbol information and, after the specific word symbol is found, obtains the information on its position in the program. If the stream provided with word symbol information is an image stream, module 130 is further used to obtain the text information corresponding to the word symbol information in the image stream.
The synch-locating module 140 is used to determine the other presentable information that is synchronized with the word symbol information appearing at the site. Because the different content streams in a multimedia program are synchronized in time, the position of a site in one content stream, such as a text stream, can be used to determine the corresponding positions in the other content streams, such as video streams or audio streams.
The presentation module 150 is used to present to the user the program contents at the particular position within a multimedia program. The extraction module 160 is used to extract a particular segment from a multimedia program. In this embodiment, the particular segment may contain the particular text information. For details of the operation of the apparatus 100, see the flow charts of Figures 2 and 3.
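
The lookup performed by the synch-locating module 140 can be sketched as follows. This is a minimal illustration rather than the implementation of the application: it assumes that every stream is described as a list of timed segments on one shared program timeline, and the names `Segment` and `synchronous_segments`, the timings and the media references are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    stream: str   # e.g. "video", "audio", "image" or "text"
    start: float  # seconds from the start of the program
    end: float    # seconds from the start of the program (exclusive)
    ref: str      # hypothetical reference to the media data


def synchronous_segments(program: List[Segment], position: float,
                         source_stream: str = "text") -> Dict[str, Segment]:
    """Return, for every stream except the source, the segment playing at `position`."""
    result: Dict[str, Segment] = {}
    for segment in program:
        if segment.stream == source_stream:
            continue  # the site was found in this stream; look only at the others
        if segment.start <= position < segment.end:
            result[segment.stream] = segment
    return result


# Hypothetical program: the words were found in the text stream at 3786 s.
program = [
    Segment("text", 3780.0, 3800.0, "caption-212"),
    Segment("video", 3700.0, 3900.0, "scene-17"),
    Segment("audio", 0.0, 5400.0, "commentary"),
    Segment("video", 3900.0, 4000.0, "scene-18"),
]
at_site = synchronous_segments(program, 3786.0)
print({name: segment.ref for name, segment in at_site.items()})
```

The point of the sketch is that no video or audio analysis is involved: once the position in the text stream is known, the other streams are reached purely through the shared timeline.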
Figure 2 is a flow chart of a process for locating contents in a multimedia program according to an embodiment of the present invention.
Firstly, a multimedia program including a stream provided with word symbol information is obtained in step 210 (S210). The word symbol information exists in a text format. For example, in a multimedia digital television program stream, the captions exist in the data stream in a text format; in a multimedia demonstration program stream, the wording of the demonstration exists in a text stream in a text format. If the multimedia program is relatively long, this step does not end until the entire locating process ends. In this embodiment, a multimedia program about Australian scenery is taken as an example. The program includes a text stream carrying the corresponding commentary.
Then, a request containing specific word symbols, such as "Sydney Opera Theater", is received from a user (S220); the user expects the specific word symbols to exist at a certain position in the text stream and hopes to find the segment containing them in the multimedia program obtained in S210.
Next, the specific word symbols are searched for in the text stream, and it is judged whether they appear at a particular site in the text stream (S230). If they are not found, the process informs the user that the specific word symbols do not occur in this multimedia program (S234), and the entire process comes to an end. If they are found, the process obtains the position information about the site where they appear (S238), for example, that "Sydney Opera Theater" appears at "01:03:06" (hh:mm:ss) from the start of the program.
After that, the corresponding position in the video stream is determined on the basis of the particular synchronization rules of the multimedia program (S240). If the video at "01:03:06" (hh:mm:ss) from the start of the program is located, the picture at that time typically shows the scenery of the Sydney Opera Theater described by the commentary. The synchronization rules of a multimedia program can vary and are not elaborated here.
Finally, the video content at the particular position is presented to the user (S250); the pictures at that position show the scenery of the Sydney Opera Theater that the user wants. It is of course possible to present to the user all the content of the multimedia program at this position, such as the video/audio, image and text, or to present only part of it, for example the audio only, to satisfy individual needs.
In the presenting process of S250, it is also possible to present the video content in the periods of time before and/or after the particular site appears. The duration of the period may be set to a value chosen by the user or to a default value fixed by the system. The user may also include starting position information and ending position information in the request of S220, both given with respect to the particular site expected by the user.
In S240 of this embodiment, the corresponding position in the audio or image streams may likewise be determined according to the synchronization rules. Since video, audio and even images are more complex than text in composition, the processes of analyzing and locating them are also much more complex than those for text.
Thus it can be seen that the locating method of the present invention is much simpler than prior methods that locate content through the audio/video.
In said locating process, if the specific word symbols, such as "Sydney Opera Theater", appear many times in said text stream, then when S250 presents the video content of the particular site to the user, the user is given a chance to choose whether to keep on searching. If the user does, the search continues along the original search direction from the last found site until the scenery the user desires is found or until the program ends. This choice can be offered in the form of a push button on the screen that prompts the user to decide whether or not to go on searching; the search then continues or ends in response to the user's input.
Figure 3 is a flow chart of a process for locating content in a multimedia program and extracting particular segments according to another embodiment of the present invention.
Firstly, a multimedia program is obtained (S310), which includes a stream provided with word symbol information existing in an image format. For example, in a multimedia demonstration program, the demonstration slides contain word symbol information and exist in an image stream in an image format. If the multimedia program is relatively long, this step continues until the entire locating process ends. Table 1 is a SMIL Script of a multimedia demonstration program including a video stream and an image stream synchronized with the video stream; said image stream includes the demonstration slides and the words on the slides, and these words are in an image format.
Table 1: A Multimedia Demonstration Program
It is seen from Table 1 that said image stream, having a layered structure, contains 9 sections: image1, image2, image3, image4, image5, image6, image7, image8 and image9.
Each section corresponds to one slide; that is, each section has its own starting position and duration, because each slide normally remains unchanged for a period of time while the video/audio generally changes constantly during the demonstration.
Since it is impossible to conduct a textual analysis directly on words existing in an image format, the text information corresponding to the word symbol information in the image stream is first obtained (S320). This step can be performed with existing Optical Character Recognition (OCR) technology.
Then, a request containing specific word symbols is received from the user (S330); the user expects the specific word symbols to exist at one or more positions in said multimedia program stream and hopes to find and extract the segments including them through the request.
Next, the specific word symbols are searched for in the word symbol information of the image stream, and it is judged whether they appear at a particular site (S340). If they are not found, the user is informed that the specific word symbols do not occur in this multimedia program (S344), and the entire process ends. If they are found, the position information about the particular site is obtained (S350). For example, if the specific word symbols appear in the word symbol information of image2, the starting position and the duration of image2 are obtained.
After that, according to the synchronization rules of the particular multimedia program, the position in the video stream corresponding to the site where the words appear is determined (S360). In this case, the starting position and duration of the particular segment of the corresponding video stream are the same as those of image2.
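The locating steps S340 to S360 can be sketched as follows, assuming that the text of every slide has already been recovered by OCR in S320 and that the sections play back to back, so that each section starts where the previous one ends. The `Section` structure, the durations and the slide texts are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Section:
    """One section (slide) of a hypothetical image stream."""
    name: str
    duration_s: int
    ocr_text: str  # text assumed to have been recovered by OCR (S320)


def find_sections(sections, phrase):
    """Return (name, start_s, duration_s) for every section containing phrase."""
    durations = [s.duration_s for s in sections]
    # Each section starts where the previous one ends; the first starts at 0.
    starts = [0] + list(accumulate(durations))[:-1]
    needle = phrase.casefold()
    return [
        (s.name, start, s.duration_s)
        for s, start in zip(sections, starts)
        if needle in s.ocr_text.casefold()
    ]


# Nine invented slides; the phrase occurs on slides 2, 5 and 8.
texts = ["Intro", "Opera House", "Reef", "Outback", "Opera House tour",
         "Wildlife", "Cities", "Opera House at night", "Summary"]
sections = [
    Section(f"image{i + 1}", 10 * (i + 1), text) for i, text in enumerate(texts)
]

segments = find_sections(sections, "opera house")
```

Under the assumed synchronization rule, each returned start and duration applies unchanged to the video stream, so `segments` directly describes the video segments to be presented or extracted.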
Finally, the original SMIL Script is modified on the basis of the obtained starting position and duration of the particular segment to obtain a new SMIL Script (S370). This SMIL Script reflects only the segment found, which makes it possible to extract the particular segment needed by the user from the multimedia program. By selectively executing the modified SMIL Script, the user can browse the needed segment directly.
After S360, a further judgment can be made as to whether it is necessary to go on with the search (S380). If not, the entire extracting process ends; if so, the process returns to S340 and continues the search from the last found site along the original search direction until the next segment or program the user wants to watch is found. The judgment can be made automatically, by checking whether the multimedia program has ended, or by prompting the user to decide.
In this embodiment, in addition to being found in image2, said particular text information is also found in image5 and image8. The modified SMIL Script finally obtained is shown in Table 2.
The multimedia program segments corresponding to the SMIL Script contain said particular text information.
Table 2: A particular segment 1 of a multimedia program
Wherein: T1 = t1, T2 = t1 + t2 + t3 + t4, T3 = t1 + t2 + t3 + t4 + t5 + t6 + t7, where ti denotes the duration of the i-th section of the image stream.
In this embodiment, the stream provided with word symbol information of the multimedia program has a layered structure. The layered structure can be presented as 9 parallel images arranged in sequence, like the chapters of a book; that is, the respective layers can be mutually contained.
Since the present invention uses the streams provided with word symbol information contained in the multimedia program to perform the locating, and since the analysis of word symbol information is much simpler than that of audio/video information, the present invention frees program producers from a lot of work and reduces the complexity of that work. It enables users to perform locating operations relatively easily with simpler and less expensive equipment. Furthermore, it also makes it possible to use voice recognition technology to convert the dialogue in the audio into text information to be used for the locating operation.
Although the present invention has been described in combination with specific embodiments, it will be obvious to those skilled in the art that various substitutions, modifications and changes can be made on the basis of the preceding description. Therefore, such substitutions, modifications and changes, if they do not depart from the spirit and fall within the scope of the following claims, should be included in the present invention.
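The SMIL modification of step S370 can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration under stated assumptions: the SMIL layout below (one video played in parallel with a sequence of slide images), the file names and the function name `extract_segments` are hypothetical and are not taken from Table 1 or Table 2, and the use of the SMIL 2.0 `clipBegin`/`clipEnd` attributes is one possible way to select a segment, not necessarily the one used in the embodiment.

```python
import xml.etree.ElementTree as ET

# Hypothetical original script: a video in parallel with a slide sequence.
ORIGINAL_SMIL = """<smil>
  <body>
    <par>
      <video src="talk.rm"/>
      <seq>
        <img src="image1.jpg" dur="10s"/>
        <img src="image2.jpg" dur="20s"/>
        <img src="image3.jpg" dur="30s"/>
      </seq>
    </par>
  </body>
</smil>"""


def extract_segments(smil_text, wanted_images):
    """Build a new SMIL document that plays only the wanted slides
    together with the matching clips of the video."""
    root = ET.fromstring(smil_text)
    par = root.find("./body/par")
    video_src = par.find("video").get("src")

    new_root = ET.Element("smil")
    new_seq = ET.SubElement(ET.SubElement(new_root, "body"), "seq")

    start = 0  # running start position of the current slide, in seconds
    for img in par.find("seq").findall("img"):
        dur = int(img.get("dur").rstrip("s"))
        if img.get("src") in wanted_images:
            seg = ET.SubElement(new_seq, "par")
            # The video clip has the same start and duration as the slide.
            ET.SubElement(seg, "video", src=video_src,
                          clipBegin=f"{start}s", clipEnd=f"{start + dur}s")
            ET.SubElement(seg, "img", src=img.get("src"), dur=f"{dur}s")
        start += dur
    return ET.tostring(new_root, encoding="unicode")


new_smil = extract_segments(ORIGINAL_SMIL, {"image2.jpg"})
```

Because the original script is left untouched and only a new, shorter script is produced, the extraction does not require re-encoding the media; a SMIL player that honours the clipping attributes would play just the selected segments.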

Claims

WHAT IS CLAIMED IS:
1. A method for locating content in a multimedia program which comprises a stream with word symbol information, the method comprising the steps of: a. receiving a request comprising a specific word symbol from a user; b. determining a position where the specific word symbol appears in the stream with word symbol information; and c. determining other presentable information synchronous with the word symbol information at the position.
2. The method according to claim 1, further comprising the step of: presenting the program content at the position to the user.
3. The method according to claim 1, wherein the other presentable information comprises at least audio information or video information.
4. The method according to claim 1, wherein the word symbol information exists in a text format.
5. The method according to claim 4, wherein the other presentable information comprises an image.
6. The method according to claim 1, wherein the word symbol exists in an image format, and further comprising the step of: obtaining text information corresponding to the word symbol information.
7. The method according to claim 1, wherein the content in the stream with word symbol information has a layered structure, and further comprising the step of: determining a layer containing the position, wherein the layer has a specific start position and a specific end position, whereby the other presentable information determined in step c has a corresponding start position and a corresponding end position.
8. The method according to claim 1, wherein the request from the user further comprises information of a start position and information of an end position with respect to the position, so that the other presentable information determined in step c has a corresponding start position and a corresponding end position.
9. The method according to claim 7 or 8, further comprising the step of: extracting a program segment having the start position and the end position.
10. The method according to claim 9, wherein the multimedia program is integrated via SMIL, and wherein the extracting step includes extracting the program segment by modifying the SMIL representation document of the multimedia program.
11. A locating apparatus for locating content in a multimedia program which comprises a stream with word symbol information, the apparatus comprising: request receiving means for receiving a request comprising a specific word symbol from a user; word symbol locating means for determining a position where the specific word symbol appears in the stream with word symbol information; and synch-locating means for determining other presentable information synchronous with the word symbol information at the position.
12. The apparatus according to claim 11, further comprising presenting means for presenting program content at the position to the user.
13. The apparatus according to claim 11, wherein the other presentable information comprises at least one of audio information or video information.
14. The apparatus according to claim 11, wherein the word symbol information exists in a text format.
15. The apparatus according to claim 14, wherein the other presentable information comprises an image.
16. The apparatus according to claim 11, wherein the word symbol exists in an image format, and the word symbol locating means is arranged to obtain text information corresponding to the word symbol information.
17. The apparatus according to claim 11, wherein the content in the stream with word symbol information has a layered structure, and the word symbol locating means is arranged to determine a layer containing the position, wherein the layer has a specific start position and a specific end position, so that the other presentable information determined by the synch-locating means has a corresponding start position and a corresponding end position.
18. The apparatus according to claim 11, wherein the request from the user further comprises information of a start position and information of an end position with respect to the position, so that the other presentable information determined by the synch-locating means has a corresponding start position and a corresponding end position.
19. An apparatus for playing multimedia programs to a user, comprising: content receiving means for receiving a multimedia program comprising a stream with word symbol information; presenting means for presenting the received multimedia program to the user; and locating means comprising: request receiving means for receiving a request including a specific word symbol from the user; word symbol locating means for determining an appearance position of the specific word symbol in the stream with word symbol information; and synch-locating means for determining other presentable information synchronous with the word symbol information at the position.
20. The apparatus according to claim 19, further comprising extracting means for extracting a specific segment from the multimedia program.
PCT/IB2005/050415 2004-02-24 2005-02-01 Method and apparatus for locating content in a program WO2005083592A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2007500321A JP2007525900A (en) 2004-02-24 2005-02-01 Method and apparatus for locating content in a program
EP05702854A EP1723555A1 (en) 2004-02-24 2005-02-01 Method and apparatus for locating content in a program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200410007668.5 2004-02-24
CN2004100076685A CN1662053A (en) 2004-02-24 2004-02-24 Program content positioning method and device

Publications (1)

Publication Number Publication Date
WO2005083592A1 true WO2005083592A1 (en) 2005-09-09

Family

ID=34892100

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2005/050415 WO2005083592A1 (en) 2004-02-24 2005-02-01 Method and apparatus for locating content in a program

Country Status (5)

Country Link
EP (1) EP1723555A1 (en)
JP (1) JP2007525900A (en)
KR (1) KR20070020208A (en)
CN (2) CN1662053A (en)
WO (1) WO2005083592A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226966A (en) * 2013-04-26 2013-07-31 广东欧珀移动通信有限公司 Method capable of quickly positioning playing progress and mobile terminal
CN103605765A (en) * 2013-11-26 2014-02-26 电子科技大学 Mass image retrieval system based on cluster compactness
CN104572714A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 Learning video inquiring system and learning video inquiring method
CN105117407A (en) * 2015-07-27 2015-12-02 电子科技大学 Image retrieval method for cluster-based distance direction histogram
CN107093336A (en) * 2016-09-06 2017-08-25 北京新学堂网络科技有限公司 A kind of preparation method that film is made to reading learning formula strip cartoon
CN108989851A (en) * 2018-08-27 2018-12-11 努比亚技术有限公司 A kind of video broadcasting method, terminal and computer readable storage medium

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075233B (en) * 2006-05-17 2012-05-02 华为技术有限公司 Member, system and method for collecting multi-medium content
US8204955B2 (en) 2007-04-25 2012-06-19 Miovision Technologies Incorporated Method and system for analyzing multimedia content
CN101470710B (en) * 2007-12-27 2011-01-12 Tcl集团股份有限公司 Method for positioning content of multimedia file
CN102955809A (en) * 2011-08-26 2013-03-06 吴志刚 Method and system for editing and playing media files
CN102592628A (en) * 2012-02-15 2012-07-18 张群 Play control method of audio and video play file
CN104572716A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 System and method for playing video files
CN104572712A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 Multimedia file browsing system and multimedia file browsing method
CN105163178B (en) * 2015-08-28 2018-08-07 北京奇艺世纪科技有限公司 A kind of video playing location positioning method and device
CN108062302B (en) * 2016-11-08 2019-03-26 北京国双科技有限公司 A kind of recognition methods of text information and device
CN107340968B (en) * 2017-07-18 2021-03-09 网易传媒科技(北京)有限公司 Method, device and computer-readable storage medium for playing multimedia file based on gesture
CN111339323A (en) * 2020-02-21 2020-06-26 联想(北京)有限公司 Information processing method and device for multimedia file

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5136655A (en) * 1990-03-26 1992-08-04 Hewlett-Packard Company Method and apparatus for indexing and retrieving audio-video data
US20020056082A1 (en) * 1999-11-17 2002-05-09 Hull Jonathan J. Techniques for receiving information during multimedia presentations and communicating the information
US6430357B1 (en) * 1998-09-22 2002-08-06 Ati International Srl Text data extraction system for interleaved video data streams

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226966A (en) * 2013-04-26 2013-07-31 广东欧珀移动通信有限公司 Method capable of quickly positioning playing progress and mobile terminal
CN104572714A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 Learning video inquiring system and learning video inquiring method
CN103605765A (en) * 2013-11-26 2014-02-26 电子科技大学 Mass image retrieval system based on cluster compactness
CN103605765B (en) * 2013-11-26 2016-11-16 电子科技大学 A kind of based on the massive image retrieval system clustering compact feature
CN105117407A (en) * 2015-07-27 2015-12-02 电子科技大学 Image retrieval method for cluster-based distance direction histogram
CN107093336A (en) * 2016-09-06 2017-08-25 北京新学堂网络科技有限公司 A kind of preparation method that film is made to reading learning formula strip cartoon
CN108989851A (en) * 2018-08-27 2018-12-11 努比亚技术有限公司 A kind of video broadcasting method, terminal and computer readable storage medium

Also Published As

Publication number Publication date
CN1922610A (en) 2007-02-28
JP2007525900A (en) 2007-09-06
EP1723555A1 (en) 2006-11-22
KR20070020208A (en) 2007-02-20
CN1662053A (en) 2005-08-31

Similar Documents

Publication Publication Date Title
WO2005083592A1 (en) Method and apparatus for locating content in a program
US8374845B2 (en) Retrieving apparatus, retrieving method, and computer program product
JP4550725B2 (en) Video viewing support system
US8799945B2 (en) Information processing apparatus, information processing method, and computer program
TWI358948B (en)
US20200126583A1 (en) Discovering highlights in transcribed source material for rapid multimedia production
US20200126559A1 (en) Creating multi-media from transcript-aligned media recordings
US8413192B2 (en) Video content viewing apparatus
KR20090004990A (en) Internet search-based television
US7904452B2 (en) Information providing server, information providing method, and information providing system
JP2009055152A (en) Motion picture generating apparatus, motion picture generating method, and program
JP2006155384A (en) Video comment input/display method and device, program, and storage medium with program stored
US20070282871A1 (en) Information processing apparatus, method, and program product
US20120066235A1 (en) Content processing device
JP2005064600A (en) Information processing apparatus, information processing method, and program
JP2010161722A (en) Data processing apparatus and method, and program
US20040177317A1 (en) Closed caption navigation
CN101188722A (en) Video recording/reproducing apparatus
US20100083314A1 (en) Information processing apparatus, information acquisition method, recording medium recording information acquisition program, and information retrieval system
US20080005100A1 (en) Multimedia system and multimedia search engine relating thereto
CN114268829B (en) Video processing method, video processing device, electronic equipment and computer readable storage medium
JP2006186426A (en) Information retrieval display apparatus, information retrieval display method, and information retrieval display program
JP2005092295A (en) Meta information generating method and device, retrieval method and device
KR102252522B1 (en) Method and system for automatic creating contents list of video based on information
US20080016068A1 (en) Media-personality information search system, media-personality information acquiring apparatus, media-personality information search apparatus, and method and program therefor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2005702854

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2007500321

Country of ref document: JP

Ref document number: 200580005730.X

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 1020067017079

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 2005702854

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020067017079

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 2005702854

Country of ref document: EP